Part of the Computer Programming Code Library.
This is a program which will make web pages for the entire Bible (provided you have typed up a properly formatted copy of the Bible).
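sub readbible {
    my @verses;
    my ($oldbook, $oldchapter) = ('', 0);
    open(local *F, '<', 'bible.txt') || die "bible.txt: $!";
    while (<F>) {
        # skip blank or malformed lines
        my ($book, $chapter, $verse, $text) = m/(\S.*?\S)\s+(\d+):(\d+)\s+(.*\S)/
            or next;
        # a new book or chapter begins: write out the one just finished
        if ($book ne $oldbook || $chapter != $oldchapter) {
            savechapter($oldbook, $oldchapter, \@verses);
            @verses = ();
        }
        $oldbook = $book;
        $oldchapter = $chapter;
        push(@verses, $text);
        # occasionally print a line to show progress
        if (rand() < .0001) {
            print "book=$book\nchapter=$chapter\nverse=$verse\nchapter=$chapter\n\n";
        }
    }
    close(F);
    # the last chapter has no following line to trigger the save
    savechapter($oldbook, $oldchapter, \@verses);
}
readbible();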
sub savechapter {
    my ($book, $chapter, $verses) = @_;
    my $dirname = $book;
    $dirname =~ s/[^\w]+/-/g;
    return unless length $dirname;
    unless (-d $dirname) {
        mkdir("$dirname") || die "Trying to make $dirname: $!";
    }
    open(local *C, '>', "$dirname/$chapter.htm") || die $!;
    my $htmlverses = join("\n", map("<li>$_</li>", @$verses));
    print C <<A;
<title>Holy Bible, $book Chapter $chapter</title>
<meta name="description" content="$verses->[0]" />
<h1>Holy Bible, $book Chapter $chapter</h1>
<ol>
$htmlverses
</ol>
<h2>Link to this chapter of the Holy Bible.</h2>
<pre>
<a href="http://togod.us/holybible/$dirname/$chapter.htm">
Holy Bible, $book Chapter $chapter
</a>
</pre>
<hr />
<address><a href="http://www.josephmyers.com/">Joseph Myers</a>, editor</address>
A
    close(C);
}
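The layout of bible.txt is not spelled out above, but the regular expression in readbible implies one verse per line: book name, then chapter:verse, then the verse text. Here is a minimal sketch of that parsing step on its own. The sub name parse_verse and the two sample lines are my assumptions for illustration, not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one line of bible.txt into (book, chapter, verse, text),
# using the same pattern as readbible.
# Returns an empty list when the line does not match.
sub parse_verse {
    my ($line) = @_;
    return $line =~ m/(\S.*?\S)\s+(\d+):(\d+)\s+(.*\S)/;
}

# Sample lines, assumed to be in the expected format.
for my $line ("Genesis 1:1 In the beginning God created the heaven and the earth.\n",
              "1 John 4:8 He that loveth not knoweth not God; for God is love.\n") {
    my ($book, $chapter, $verse, $text) = parse_verse($line);
    print "book=$book chapter=$chapter verse=$verse\n";
}
```

The lazy `.*?` in the first group is what lets a book name such as "1 John" keep its leading number: the book stops at the first place where a chapter:verse pair can follow.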
This program will pick some random Bible verses:
sub readbible {
    open(local *F, '<', 'bible.txt') || die "bible.txt: $!";
    while (<F>) {
        my ($book, $chapter, $verse, $text) = m/(\S.*?\S)\s+(\d+):(\d+)\s+(.*\S)/;
        # print roughly one verse in ten thousand
        if (rand() < .0001) {
            print "book=$book\nchapter=$chapter\nverse=$verse\nchapter=$chapter\n\n";
        }
    }
    close(F);
    return;
}
readbible();
Here is some example output.
book=Genesis
chapter=41
verse=8
chapter=41

book=Leviticus
chapter=16
verse=12
chapter=16

book=Psalms
chapter=22
verse=11
chapter=22

book=Matthew
chapter=21
verse=26
chapter=21

real 0m0.167s
user 0m0.157s
sys 0m0.010s
Today I will download and test mathopd-1.5p6 and compare it to Apache 2.2.10.
I extract the mathopd tarball.
cd src
# Add/change these lines in the Makefile
# CFLAGS = -Os -Wall
# CPPFLAGS = -DHAVE_CRYPT_H
# CPPFLAGS += -DLINUX_SENDFILE
# EXTRA_OBJS += sendfile.o
I want to place all of the lines of a file which match something at the top, and those which don't at the bottom. How do I do this?
Solution
#!/usr/bin/perl
# matchtop.pl
my $expr = shift;
my (@yes, @no);
while (<STDIN>) {
    if (m/$expr/i) {
        push(@yes, $_);
    } else {
        push(@no, $_);
    }
}
print @yes, @no;
Sometimes one just needs to get rid of regular expressions completely.
For example, I previously used the top line to find the host name from my log file entries.
# m/(\S+)\s*$/; my $host = $1;
my $host = substr($_, rindex($_, ' ')+1, -1);
I replaced this with the bottom line. Guess how big of a difference it made?
# old version
real 0m3.259s
user 0m3.155s
sys 0m0.031s
# new version
real 0m0.630s
user 0m0.588s
sys 0m0.045s
Both of these results were from running the following command, which processes 100,000 log entries.
jmyers@lilly(~/uspremium)$ time perl ~/access-myersdaily/p < access.100000 | tail ...
I use tail so that I don't have to see 8,000 rankings, but so that the program will still process all the data. If I asked for "head" then the program would stop running after processing only a small portion of the data. (It would soon receive a SIGPIPE when head received more than ten lines, and then it would terminate.)
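To try the comparison without the full log-processing program, here is a minimal sketch using the standard Benchmark module. The sub names and the sample log entry are made up for illustration (my real log format isn't shown here), so the ratio it reports need not match the timings above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Old approach: capture the last whitespace-delimited field with a regex.
sub host_regex {
    my ($line) = @_;
    return $line =~ m/(\S+)\s*$/ ? $1 : undef;
}

# New approach: everything after the last space, minus the final character
# (the newline).
sub host_substr {
    my ($line) = @_;
    return substr($line, rindex($line, ' ') + 1, -1);
}

# Hypothetical log entry with the host name as the last field.
my $entry = "GET /index.html 200 1234 host.example.com\n";

print host_regex($entry), "\n";
print host_substr($entry), "\n";

# Run each version for about one CPU second and print a comparison table.
cmpthese(-1, {
    regex  => sub { host_regex($entry) },
    substr => sub { host_substr($entry) },
});
```

One caution about the substr version: the length of -1 always drops the last character, so it assumes every line ends in a newline and has no trailing spaces.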
I have an account with Sedo as an agent of U.S. Premium Domain Services. One of my jobs today is to upload about 700 domains for sale to the uspremium Sedo account.
First, I downloaded a semicolon-separated-value spreadsheet file (really, a text file) of all the currently offered domains at Sedo.
Domain name;Language;Currency;Price;Minimum Offer;Price Option;Category 1;Category 2;Category 3;Current Keyword;Unique Views;Clicks;Earnings Per Click;Click Rate;Revenue;Inserted;Views Of Offer;Offers;Portfolio
An example of a data line is this.
deliverer.us;English (13);;1000;;Fixed price (2);;;;deliverer;1;0;0,00000000;0,0000;0,0000;;0;;
Here's what I do in order to extract the domain names.
cut -d ';' -f 1 < 1269524_20081105_174404.csv > o.1
Next I need a Perl script that lists the domains which are not yet offered at Sedo (bother, I've already written it before, but I don't remember where it is--that's why this web page of tips and hints is for me as well as for you).
#!/usr/bin/perl
# newitems
# usage: newitems previouslist < updatedlist
# previouslist is a file containing a previous list of items
# updatedlist is a file containing an updated list of items
# the result is items contained in the updated list which
# weren't in the previous list
my $f1 = shift;
open(local *F, '<', $f1) || die "$f1: $!";
chomp(my @prev = <F>);
close(F);
# hash of previous items for fast lookup
my %prev = map { ($_, 1) } @prev;
chomp(my @new = <STDIN>);
for (@new) {
    print $_, "\n" unless $prev{$_};
}
Now I retrieve new items from my agent's assigned list of domains.
(cd ~/usp/inventory; cut -b 14- < list.txt) | newitems ~jkm/b/o.1 >> newitems.txt
Then I upload them to Sedo 50 at a time.
Today we will install and evaluate Ubuntu 8.10. On this computer we already have Linux Mint installed--far better than any previous experience I have had with an Ubuntu-related operating system. Let's see if Ubuntu 8.10 can beat it. Of course, we will start by installing GAG, which enables easy selection from multiple operating systems on the same hard drive, including more operating systems which may be installed in the future.
We'll install GAG 4.10, and maybe also PC-BSD 7.0.1, which actually hasn't been installed yet, due to a Gigabyte motherboard that I couldn't persuade to boot from the CD-ROM drive, no matter what I did. Thus it always starts up Mandriva 2009 One, which is the last thing I installed. That computer is a quad-core AMD Phenom 9750. This computer is a dual-core AMD 5600+ with 2.8 GHz and 2x1 MB L2 cache (as opposed to the version with 2.9 GHz and 2x512 KB L2 cache). Actually, this computer (the one I'm typing on) is FreeBSD 7.0 with Dell PowerEdge SC440 / Intel 1.8 GHz Pentium dual-core. It only has an RGB monitor plug, which is convenient, because I just push the "source" button on my Samsung 2253 BW 22" monitor, and it switches to the other computer, which has digital (DVI) input. [Really, emacs is so stupid to turn my HTML green after I type a quotation mark. But I won't let that Microsoft Word-type dumbness keep me from saying 22".]
Well, actually, GAG didn't work. Linux Mint would not boot from GAG, so I decided to install PC-BSD 7.0.1 instead. It took a few minutes to boot from the DVD and verify the installation and select my desired options: all but FileZilla, KDEEdu (educational software), KTorrent, KVirc, Pidgin, and source code. The installation took about 8 minutes after pressing Install.
Now I guess that there is an error in the Ubuntu installation CD that I burned (i.e., made, i.e., recorded).
PC-BSD is working beautifully... until I changed the graphics driver to radeonhd. Now the Xorg process uses almost all of the CPU and everything else is like a 20-year-old computer, slow to respond and hard of hearing. There is not a working method to change the driver in PC-BSD--the system freezes, and one has to reboot it. Then after rebooting the configuration of X11 resumes. This wastes a lot of time, because the configuration of X11 should just happen, not a system freeze and reboot in the middle of it. (Note: the system freeze happens before the configuration program even starts up.)
Today I will evaluate, or at least install, the new release of PC-BSD: ftp://mirrors.isc.org/pub/pcbsd/7.0.1/i386/PCBSD7.0.1-x86-DVD.iso.
Not only the computer hardware, but the operating system and the installed software components are measured by the results of this test!
Today I tried my new Quad-Core supercomputer with an AMD X4 9750 processor. I downloaded the classic libjpeg source code and timed how long it took to compile in parallel using make -j 4 with the configuration CFLAGS=-Os. It took two seconds flat.
So I came home, and with my brother, John, I compiled the same source code on a 1998/1998 model grape iMac 333MHz computer. (The original huge (tiny) 6 GB hard drive was replaced with a 40 GB ATA Seagate hard drive way back in 2002, along with a fresh installation of Mac OS X Jaguar 10.2, which is now 10.2.8.)
The shell output of zsh from compiling libjpeg results in the following data:
#1 Download the file: jpegsrc.v6b.tar.gz
#2 Use zsh to obtain detailed timings.
#3 Extract it
zcat /Volumes/Programs/Archives/b/old/jpegsrc.v6b.tar.gz 0.14s user 0.06s system 24% cpu 0.807 total
tar -xf - 0.04s user 0.28s system 44% cpu 0.722 total
#4
CFLAGS=-Os ./configure 1.95s user 5.79s system 58% cpu 13.201 total
#5
make -j 4 55.85s user 13.25s system 85% cpu 1:20.74 total
Wow! So the old iMac is able to compile the code, albeit 40 times more slowly.
Hello! Today I'm writing about my experience with Mandriva Linux 2009 (Version One). I'm not going to cover installation, because others can do a better job of that. Suffice to say, I installed it on a new 500 GB hard drive, using about 2.7 GB of an initial 50 GB partition. I'm looking forward to installing more operating systems and testing them on this AMD Phenom Quad Core 9750 2.4 GHz system!
The first thing I'm going to do is add software. I'm disappointed that apt-get is not available. I go to Menu / Install & Remove Software. I'm prompted for the root password that I set up during installation. Next, I click No, I don't want to add media sources now. I start with System because I would like to install apt-get, so I can install everything else via the command line interface (CLI). Besides apt-get, my installation goals include some of my favorite and most-used software programs:
Unfortunately, Software Management cannot find any new programs to install! How dumb. I have to go to Configure media / Add / Full set of sources. Now, under System / Configuration / Packaging, I can find apt, "Debian's Advanced..." version 0.5.15lorg3.2, Release 7mdv2009.0. I check the box next to it and then some changes need to be applied, then I click the final Apply button.
Now I type "su" and enter my root password. Then, as a start, I type
apt-get install emacs-nox latex opera
Bother, it does not know what emacs-nox is, so I will change it to just emacs. Yuck, it didn't even know what emacs was! It obviously doesn't work. Maybe I'll just have to go back to the graphical Software Management system and install things from there. Horrors.
Well, I can install them using the phooey GUI, except for Opera, which it seems is not available for Mandriva 2009 yet.
By the way, when I'm running Mandriva Update, I get the message to use "urpme --auto-orphans" to remove an orphan package.
It's irritating, but cjpeg and djpeg are in jpeg-progs instead of with the libjpeg that's already installed. So when I'm trying to change a TIFF file to JPEG before uploading to Shutterfly, I have a problem. But it's great that anytopnm is installed. That was a very nice positive surprise. (Amazon gave me a free 8 x 8 photobook from Shutterfly.)
This is how to add numbers in comma/tab-delimited text files. For example, in my case I want to get a total of the orders received in an email "blast".
((echo print ' '; tail -120 allsales.txt | grep '0108.ab' | grep -v Myers | cut -f 53) | perl -pe 's|\n|+|g'; echo 0 ',"\n"') | perl
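The one-liner works by turning the column of numbers into a Perl expression (print n1+n2+...+0) and piping that to perl. The same total can be computed directly; here is a minimal sketch, assuming tab-separated input as with cut -f. The sub name sum_column and the sample rows are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum one tab-separated column (1-based, like cut -f) over a list of lines.
# Lines where the column is missing or not a plain number are skipped.
sub sum_column {
    my ($column, @lines) = @_;
    my $total = 0;
    for my $line (@lines) {
        chomp(my $copy = $line);
        my @fields = split /\t/, $copy, -1;
        my $value = $fields[$column - 1];
        next unless defined $value && $value =~ /^-?\d+(?:\.\d+)?$/;
        $total += $value;
    }
    return $total;
}

# Made-up sample rows: order code, customer, amount (the last one is blank).
my @rows = ("0108.ab\tSmith\t20.50\n",
            "0108.ab\tJones\t4.50\n",
            "0108.ab\tLee\t\n");
print sum_column(3, @rows), "\n";
```

For the real file, the tail and grep filters would stay as they are, with the script reading their output and calling sum_column(53, <STDIN>). Unlike the generated-expression trick, a blank or non-numeric field is skipped instead of producing a syntax error.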
How to send out a batch of emails to customers.
mkdir batch9
head -18000 email-input.txt | tail -2000 > batch9/emails.txt
e msgs.pl
cd batch9
perl ../msgs.pl ../email.mime < emails.txt
cp msgs/505.mime ~/www/codelib.net/505.html
(View this file with your browser to test.)
tar -czf ~/www/codelib.net/batch9.tgz batch9
lilly -> ssh to lilly
@lilly pp -> ssh to providenceproject.com
wget http://codelib.net/batch9.tgz
tar -xzf batch9.tgz
cd msgs
(sh ~/bin/send.sh &) >> log 2>&1
#!/usr/bin/perl
# any2lf
# converts input from Macintosh (CR) or Windows (CRLF) format
# into Unix canonical linebreaks (LF).
while (<>) { s/\r\n|\r/\n/g; print; }
I install this program on all my systems due to the frequency
of receiving files from other operating systems. It simplifies
processing so much--and I had done the command line version
perl -i -pe 's/\r\n|\r/\n/g'
so often that my fingers were very tired!
An example of when I used it just happened today.
I downloaded a privately uploaded file from
DropOff.us consisting of
comma separated values (CSV) from FileMaker Pro.
However, it was in old-fashioned style Macintosh text,
so I used any2lf in order to convert it
to LF and use the data to send out a list of emails.
"Mrs.","Holly","Smith","holly@smith.com"
How do I send out an email advertisement (HTML version of email advertisement) from a file containing these values, formatted as above?
First, I randomize the input so that I can send out batches and have the results from each batch be an accurate random sample.
# shuffle.pl
# a random sort comparator gives a biased order, so use List::Util instead
use List::Util 'shuffle';
@lines = <STDIN>;
print shuffle(@lines);
I use the program by doing
perl shuffle.pl < emails.txt > emails-random.txt
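For each batch to be a fair random sample, the shuffle has to give every ordering of the lines the same probability. Sorting with a random comparator does not guarantee that; the Fisher-Yates shuffle does, and List::Util's shuffle provides the same guarantee. Here is a minimal hand-written sketch, with made-up sample addresses standing in for the lines of emails.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fisher-Yates shuffle: walk from the end of the array, swapping each
# element with a randomly chosen element at or before it.
sub fisher_yates {
    my @items = @_;
    for (my $i = $#items; $i > 0; $i--) {
        my $j = int(rand($i + 1));      # 0 <= $j <= $i
        @items[$i, $j] = @items[$j, $i];
    }
    return @items;
}

# Made-up sample data; real input would come from <STDIN>.
my @lines = map { "address$_\@example.com\n" } 1 .. 5;
print fisher_yates(@lines);
```

The sub works on a copy of its arguments, so the caller's array is left in its original order.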
Next I create a folder called "batch1" and I input the first 500 emails.
head -500 emails-random.txt > batch1/list.txt
I cd into the batch1 directory, and
then I run the following perl script with the
arguments perl ../msgs.pl ../email.mime < list.txt
#!/usr/bin/perl -I/home/j/apache228_4-14-08/www/uspremium.us
# msgs.pl
use easy;
my $xid = 1;
system('mkdir msgs') unless -d 'msgs';
#"Mrs.","Becky","Adrian","nbadrian@insightbb.com"
my $filename = shift;
$file = template($filename) || die "no file found in $filename\n";
while (<>) {
    # the four quoted fields: title, first name, last name, email
    my ($title, $name1, $name2, $email) = m/"(.*?)"/g;
    my $txt = $file;
    $txt =~ s/xemail/"$name1 $name2" <$email>/;
    $txt =~ s/xdear/$title $name2/g;
    $txt =~ s/xid/$xid/g;
    # $. is the input line number, so messages are named 1.mime, 2.mime, ...
    outputfile("msgs/$..mime", $txt);
}
sub outputfile {
    my $f = shift;
    my $s = shift;
    open(local *FH, '>', $f) || die "$f: $!";
    print FH $s;
    close(FH);
}
Of course, the easy.pm Perl script/pragma/module is the extremely useful module written in 2000, almost ten years ago. It's compatible with Perl back to version 5.004, and it runs on Macintosh, even the classic version back to 7.6 under MacPerl, Windows, Linux, and Unix.
At last, I upload the newly created "msgs" folder to
my server, containing 500 pre-made email messages.
After extracting the tarball that I uploaded, I
go inside of it, and run the following shell
script sh ~/bin/send.sh.
# send.sh
sendmail=/usr/sbin/sendmail
for a in *.mime; do
  echo $a
  if $sendmail -t -oi < $a; then
    rm $a
    sleep 1
  fi
done
This has the benefit that any unsuccessful
emails are left over after the other messages
have been sent. Also, the program can be
safely terminated at any time, and
then restarted again at the same point by
repeating the original command
sh ~/bin/send.sh.
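The restart property comes from one rule: a message file is deleted only after it has been sent. Here is a minimal Perl sketch of the same loop, not a replacement for send.sh. The sub names send_all and sendmail_file are made up for illustration, the one-second sleep is left out, and the sender is passed in as a code reference so the loop can be tried without a mail server.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Try to send every *.mime file in $dir. $send is a code reference that
# takes a file name and returns true on success. A file is removed only
# after it was sent, so whatever is left still needs to go out.
sub send_all {
    my ($dir, $send) = @_;
    my $sent = 0;
    for my $file (sort(bsd_glob("$dir/*.mime"))) {
        if ($send->($file)) {
            unlink($file) || warn "cannot remove $file: $!";
            $sent++;
        }
    }
    return $sent;
}

# One possible sender: pipe the file to sendmail with the same flags
# that send.sh uses.
sub sendmail_file {
    my ($file) = @_;
    open(my $in, '<', $file) || return 0;
    open(my $out, '|-', '/usr/sbin/sendmail', '-t', '-oi') || return 0;
    print $out $_ while <$in>;
    close($in);
    return close($out);   # false if sendmail exited with an error
}
```

A run from inside the msgs folder would then be send_all('.', \&sendmail_file); repeating that call after an interruption picks up only the files that remain.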