Unix Programming

Part of the Computer Programming Code Library.

18 November 2008 The Holy Bible Again

This is a program which will make web pages for the entire Bible (provided you have typed up a properly formatted copy of the Bible).

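sub readbible {
    my @verses;
    # Parentheses are needed here: "my $oldbook, $oldchapter;" declares only the first.
    my ($oldbook, $oldchapter) = ('', 0);
    open(local *F, '<', 'bible.txt') || die "Cannot open bible.txt: $!";
    while (<F>) {
	# Each line looks like "Book Name 3:16 text of the verse".
	my ($book, $chapter, $verse, $text) = m/(\S.*?\S)\s+(\d+):(\d+)\s+(.*\S)/;
	next unless defined $book;	# skip blank or malformed lines
	if ($book ne $oldbook || $chapter != $oldchapter) {
	    savechapter($oldbook, $oldchapter, \@verses);
	    @verses = ();
	}
	$oldbook = $book;
	$oldchapter = $chapter;
	push(@verses, $text);
	if (rand() < .0001) {
	    print "book=$book\nchapter=$chapter\nverse=$verse\nchapter=$chapter\n\n";
	}
    }
    close(F);
    # The loop only saves a chapter when the next one begins, so save the last one here.
    savechapter($oldbook, $oldchapter, \@verses);
}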

readbible();

sub savechapter {
    my ($book, $chapter, $verses) = @_;
    my $dirname = $book;
    $dirname =~ s/[^\w]+/-/g;
    return unless length $dirname;
    unless (-d $dirname) {
	mkdir("$dirname") || die "Trying to make $dirname: $!";
    }
    open(local *C, '>', "$dirname/$chapter.htm") || die $!;
    my $htmlverses = join("\n", map("<li>$_</li>", @$verses));
    print C <<A;
<title>Holy Bible, $book Chapter $chapter</title>
<meta name="description" content="$verses->[0]" />
<h1>Holy Bible, $book Chapter $chapter</h1>
<ol>
$htmlverses
</ol>
<h2>Link to this chapter of the Holy Bible.</h2>
<pre>
&lt;a href="http://togod.us/holybible/$dirname/$chapter.htm">
Holy Bible, $book Chapter $chapter
&lt;/a>
</pre>
<hr />
<address><a href="http://www.josephmyers.com/">Joseph Myers</a>, editor</address>
A
    close(C);
}

18 November 2008 The Holy Bible

This program will pick some random Bible verses:

sub readbible {
    open(local *F, '<', 'bible.txt') || die "Cannot open bible.txt: $!";
    while (<F>) {
        my ($book, $chapter, $verse, $text) = m/(\S.*?\S)\s+(\d+):(\d+)\s+(.*\S)/;
        if (rand() < .0001) {
            print "book=$book\nchapter=$chapter\nverse=$verse\nchapter=$chapter\n\n";
        }
    }
    return;
}

readbible();

Here is some example output.

book=Genesis
chapter=41
verse=8
chapter=41

book=Leviticus
chapter=16
verse=12
chapter=16

book=Psalms
chapter=22
verse=11
chapter=22

book=Matthew
chapter=21
verse=26
chapter=21


real    0m0.167s
user    0m0.157s
sys     0m0.010s

18 November 2008 How Hot Is Mathopd?

Today I will download and test mathopd-1.5p6 and compare it to Apache 2.2.10.

I extract the mathopd tarball.

cd src
# Add/change these lines in the Makefile
# CFLAGS = -Os -Wall
# CPPFLAGS = -DHAVE_CRYPT_H
# CPPFLAGS += -DLINUX_SENDFILE
# EXTRA_OBJS += sendfile.o

13 November 2008 Separating Matches from Other Lines

I want to place all of the lines of a file which match something at the top, and those which don't at the bottom. How do I do this?

Solution

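#!/usr/bin/perl
# matchtop.pl
# usage: matchtop.pl pattern < input
my $expr = shift;
die "usage: matchtop.pl pattern < input\n" unless defined $expr;
# Parentheses are needed here: "my @yes, @no;" declares only @yes.
my (@yes, @no);
while (<STDIN>) {
    if (m/$expr/i) {
        push(@yes, $_);
    } else {
        push(@no, $_);
    }
}
print @yes, @no;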
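
When the input is a file on disk rather than a pipe, roughly the same result can be sketched in plain shell with two grep passes. This is only a sketch: match_top is a made-up name, and grep uses POSIX basic regular expressions, so a pattern that relies on Perl-only syntax will behave differently.

```shell
# match_top PATTERN FILE
# Prints the lines of FILE that match PATTERN (ignoring case) first,
# then all the other lines, each group in its original order.
match_top() {
    grep -i -e "$1" -- "$2"
    grep -i -v -e "$1" -- "$2"
    return 0    # grep exits 1 when it selects nothing; that is not an error here
}

# usage: match_top banana fruit.txt
```

The file is read twice, so this cannot be used on standard input the way matchtop.pl can.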

10 Nov 2008 Performance Enhancing Regular Expressions

Sometimes one just needs to get rid of regular expressions completely.

For example, I previously used the top line to find the host name from my log file entries.

# m/(\S+)\s*$/; my $host = $1;
my $host = substr($_, rindex($_, ' ')+1, -1);

I replaced this with the bottom line. Guess how big of a difference it made?

# old version
real    0m3.259s
user    0m3.155s
sys     0m0.031s

# new version
real    0m0.630s
user    0m0.588s
sys     0m0.045s
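
The same trick works in the shell, where parameter expansion can stand in for a regular expression. A minimal sketch, assuming (as in my log entries) that the wanted value is whatever follows the last space on the line; last_field is a made-up name and the sample line below is invented.

```shell
# last_field LINE: print whatever follows the final space in LINE.
# ${1##* } removes the longest prefix that ends in a space.
last_field() {
    printf '%s\n' "${1##* }"
}

last_field '10.0.0.1 - - "GET / HTTP/1.0" 200 512 example.com'
```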

Both of these results were from running the following command, which processes 100,000 log entries.

jmyers@lilly(~/uspremium)$ time perl ~/access-myersdaily/p < access.100000 | tail
...

I use tail so that I don't have to see 8,000 rankings while the program still processes all of the data. If I used head instead, the program would stop after processing only a small portion of the data: head exits once it has printed ten lines, and the program is then killed by SIGPIPE the next time it writes to the closed pipe.
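
The difference between tail and head can be seen by looking at the exit status of the program on the left side of the pipe. This is a rough sketch: producer_status is a made-up helper, and an awk loop that prints a million lines stands in for the real log processor.

```shell
# producer_status CONSUMER [ARGS...]
# Pipes a long-running producer into CONSUMER, discards the consumer's
# output, and prints the producer's exit status.
producer_status() {
    {
        (
            awk 'BEGIN { for (i = 1; i <= 1000000; i++) print i }' 2>/dev/null
            echo "$?" >&3    # report the producer's status on file descriptor 3
        ) | "$@" >/dev/null
    } 3>&1
}

producer_status tail -n 10    # prints 0: the producer wrote every line
producer_status head -n 10    # prints a nonzero status; usually 141, which is 128 + 13 (SIGPIPE)
```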

6 Nov 2008 Submitting Domains for Sale to Sedo

I have an account with Sedo as an agent of U.S. Premium Domain Services. One of my jobs today is to upload about 700 domains for sale to the uspremium Sedo account.

First, I downloaded a semicolon-separated-value spreadsheet file (really, a text file) of all the currently offered domains at Sedo.

Domain name;Language;Currency;Price;Minimum Offer;Price Option;Category 1;Category 2;Category 3;Current Keyword;Unique Views;Clicks;Earnings Per Click;Click Rate;Revenue;Inserted;Views Of Offer;Offers;Portfolio

An example of a data line is this.

deliverer.us;English (13);;1000;;Fixed price (2);;;;deliverer;1;0;0,00000000;0,0000;0,0000;;0;;

Here's what I do in order to extract the domain names.

cut -d ';' -f 1 < 1269524_20081105_174404.csv > o.1
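
The cut command above also copies the header's "Domain name" into o.1. That is harmless for my purposes, but if the header is unwanted it can be skipped first. A small sketch, assuming the header is always exactly the first line; domain_names is a made-up name.

```shell
# domain_names: read a semicolon-separated export on stdin and print the
# first field of every line except the header line.
domain_names() {
    tail -n +2 | cut -d ';' -f 1
}

# usage: domain_names < 1269524_20081105_174404.csv > o.1
```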

Next I need to find out which domains on my list are not yet offered at Sedo. I will have to write a Perl script for that (bother, I've already done it before, but I don't remember where it is--that's why this web page of tips and hints is for me as well as for you).

#!/usr/bin/perl
# newitems
# usage: newitems previouslist < updatedlist
# previouslist is a file containing a previous list of items
# updatedlist is a file containing an updated list of items
# the result is the items contained in the updated list which
# weren't in the previous list

my $f1 = shift;
open(local *F, '<', $f1) || die "Cannot open $f1: $!";
chomp(my @prev = <F>);
close(F);
my %prev = map { ($_, 1) } @prev;
chomp(my @new = <STDIN>);
for (@new) {
    print $_, "\n" unless $prev{$_};
}
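
For the next time I lose the script: as far as I can tell, grep alone gives the same result. A sketch under that assumption, with new_items as a made-up name; it compares whole lines as fixed strings, just as the hash lookup does.

```shell
# new_items PREVIOUSLIST
# Reads the updated list on stdin and prints the lines that do not
# appear anywhere in the file PREVIOUSLIST.
new_items() {
    # -F fixed strings, -x whole-line matches only,
    # -v print non-matching lines, -f read the patterns from a file
    grep -F -x -v -f "$1"
    return 0    # "nothing new" makes grep exit 1, which is not an error here
}

# usage: new_items o.1 < updatedlist
```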

Now I retrieve the new items from my agent's assigned list of domains.

(cd ~/usp/inventory; cut -b 14- < list.txt) | newitems ~jkm/b/o.1 >> newitems.txt

Then I upload them to Sedo 50 at a time.
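
One way to prepare the groups of 50 is the standard split utility. A minimal sketch: split_batches and the part. prefix are made-up names, and the upload itself still has to be done by hand.

```shell
# split_batches FILE PREFIX
# Writes FILE out in pieces of at most 50 lines each, named
# PREFIXaa, PREFIXab, PREFIXac, and so on.
split_batches() {
    split -l 50 "$1" "$2"
}

# usage: split_batches newitems.txt part.
```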

30 Oct 2008 Ubuntu 8.10

Today we will install and evaluate Ubuntu 8.10. On this computer we already have Linux Mint installed--far better than any previous experience I have had with an Ubuntu-related operating system. Let's see if Ubuntu 8.10 can beat it. Of course, we will start by installing GAG, which enables easy selection from multiple operating systems on the same hard drive, including more operating systems which may be installed in the future.

We'll install GAG 4.10, and maybe also PC-BSD 7.0.1, which, actually hasn't been installed yet, due to a Gigabyte motherboard which I couldn't find out how to persuade to boot from the CD-ROM drive, no matter what I did. Thus it always starts up Mandriva 2009 One, which is the last thing I installed. That computer is a quad-core AMD Phenom 9750. This computer is a dual-core AMD 5600+ with 2.8 GHz and 2x1 MB L2 cache (as opposed to the version with 2.9 GHz and 2x512 KB L2 cache). Actually, this computer (the one I'm typing on) is FreeBSD 7.0 with Dell PowerEdge SC440 / Intel 1.8 GHz Pentium dual-core. It only has an RGB monitor plug, which is convenient, because I just push the "source" button on my Samsung 2253 BW 22" monitor, and it switches to the other computer, which has digital (DVI) input. [Really, emacs is so stupid to turn my HTML green after I type a quotation mark. But I won't let that Microsoft Word-type dumbness keep me from saying 22".]

Well, actually, GAG didn't work. Linux Mint would not boot from GAG, so I decided to install PC-BSD 7.0.1 instead. It took a few minutes to boot from the DVD and verify the installation and select my desired options: all but FileZilla, KDEEdu (educational software), KTorrent, KVirc, Pidgin, and source code. The installation took about 8 minutes after pressing Install.

Now I guess that there is an error in the Ubuntu installation CD that I burned (i.e., made, i.e., recorded).

PC-BSD is working beautifully... until I changed the graphics driver to radeonhd. Now the Xorg process uses almost all of the CPU and everything else is like a 20-year-old computer, slow to respond and hard of hearing. There is not a working method to change the driver in PC-BSD--the system freezes, and one has to reboot it. Then after rebooting the configuration of X11 resumes. This wastes a lot of time, because the configuration of X11 should just happen, not a system freeze and reboot in the middle of it. (Note: the system freeze happens before the configuration program even starts up.)

22 Oct 2008 PC-BSD 7.0.1 (Based on FreeBSD 7.1 Prerelease)

Today I will evaluate, or at least install, the new release of PC-BSD: ftp://mirrors.isc.org/pub/pcbsd/7.0.1/i386/PCBSD7.0.1-x86-DVD.iso.

17 Oct 2008 Performance Speed Test!

Not only the computer hardware, but the operating system and the installed software components are measured by the results of this test!

Today I tried my new Quad-Core supercomputer with an AMD X4 9750 processor. I downloaded the classic libjpeg source code and timed how long it took to compile in parallel using make -j 4 with the configuration CFLAGS=-Os. It took two seconds flat.

So I came home, and with my brother, John, I compiled the same source code on a 1998/1998 model grape iMac 333MHz computer. (The original huge (tiny) 6 GB hard drive was replaced with a 40 GB ATA Seagate hard drive way back in 2002, along with a fresh installation of Mac OS X Jaguar 10.2, which is now 10.2.8.)

Here are the timings that zsh reported while compiling libjpeg:

#1
Download the file: jpegsrc.v6b.tar.gz

#2
Use zsh to obtain detailed timings.

#3
Extract it
  zcat /Volumes/Programs/Archives/b/old/jpegsrc.v6b.tar.gz
0.14s user 0.06s system 24% cpu 0.807 total
  tar -xf -
0.04s user 0.28s system 44% cpu 0.722 total

#4
  CFLAGS=-Os ./configure
1.95s user 5.79s system 58% cpu 13.201 total

#5
  make -j 4 
55.85s user 13.25s system 85% cpu 1:20.74 total

Wow! So the old iMac is able to compile the code, albeit 40 times more slowly.
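
The factor of 40 is just the two wall-clock times divided: 1:20.74 is 60 + 20.74 = 80.74 seconds, against 2 seconds on the quad-core. As a quick check (slowdown is a made-up helper name):

```shell
# slowdown SLOW_SECONDS FAST_SECONDS: print how many times slower the first is.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

slowdown 80.74 2    # prints 40.4
```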

11 Oct 2008

Hello! Today I'm writing about my experience with Mandriva Linux 2009 (Version One). I'm not going to cover installation, because others can do a better job of that. Suffice to say, I installed it on a new 500 GB hard drive, using about 2.7 GB of an initial 50 GB partition. I'm looking forward to installing more operating systems and testing them on this AMD Phenom Quad Core 9750 2.4 GHz system!

The first thing I'm going to do is add software. I'm disappointed that apt-get is not available. I go to Menu / Install & Remove Software. I'm prompted for the root password that I set up during installation. Next, I click No, I don't want to add media sources now. I start with System because I would like to install apt-get, so I can install everything else via the command line interface (CLI). Besides apt-get, my installation goals include some of my favorite and most-used software programs:

Unfortunately, Software Management cannot find any new programs to install! How dumb. I have to go to Configure media / Add / Full set of sources. Now, under System / Configuration / Packaging, I can find apt, "Debian's Advanced..." version 0.5.15lorg3.2, Release 7mdv2009.0. I check the box next to it and then some changes need to be applied, then I click the final Apply button.

Now I type "su" and enter my root password. Then, as a start, I type

apt-get install emacs-nox latex opera

Bother, it does not know what emacs-nox is, so I will change it to just emacs. Yuck, it didn't even know what emacs was! It obviously doesn't work. Maybe I'll just have to go back to the graphical Software Management system and install things from there. Horrors.

Well, I can install them using the phooey GUI, except for Opera, which it seems is not available for Mandriva 2009 yet.

By the way, when I'm running Mandriva Update, I get the message to use "urpme --auto-orphans" to remove an orphan package.

It's irritating, but cjpeg and djpeg are in jpeg-progs instead of with the libjpeg that's already installed. So when I'm trying to change a TIFF file to JPEG before uploading to Shutterfly, I have a problem. But it's great that anytopnm is installed. That was a very nice positive surprise. (Amazon gave me a free 8 x 8 photobook from Shutterfly.)

04 Oct 2008

This is how to add numbers in comma/tab-delimited text files. For example, in my case I want to get a total of the orders received in an email "blast".

((echo print ' '; tail -120 allsales.txt | grep '0108.ab' | grep -v Myers | cut -f 53) | perl -i -pe 's|\n|+|g'; echo 0 ',"\n"') | perl
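
The one-liner above works by gluing the numbers together with plus signs and handing the result to perl as a program. A shorter route is to let awk do the adding. This is a sketch, assuming the amounts are plain numbers (no currency signs or thousands separators) in a tab-delimited field; sum_field is a made-up name.

```shell
# sum_field N: add up field N of tab-delimited input on stdin and
# print the total with two decimal places.
sum_field() {
    awk -F '\t' -v n="$1" '{ total += $n } END { printf "%.2f\n", total }'
}

# usage: tail -120 allsales.txt | grep '0108.ab' | grep -v Myers | sum_field 53
```

printf is used instead of print because awk's default number format keeps only six significant digits, which would round a large total.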

27 Sep 2008

How to send out a batch of emails to customers.

  1. Randomize an email input file before creating any batches of emails.
  2. Select a group of 2,000 emails for each batch, for instance, batch nine.
    mkdir batch9
    head -18000 email-input.txt | tail -2000 > batch9/emails.txt
  3. Update the msgs.pl script with a new email ID number.
    e msgs.pl
  4. Generate the emails.
    cd batch9
    perl ../msgs.pl ../email.mime < emails.txt
  5. Test a random email for appearance and functionality of links. (Preferably also email one to yourself at several common email service providers like Yahoo! and Hotmail.)
    cp msgs/505.mime ~/www/codelib.net/505.html
    (View this file with your browser to test.)
  6. Archive the emails in batch9.
    tar -czf ~/www/codelib.net/batch9.tgz batch9
  7. Connect to server and then business host.
    lilly -> ssh to lilly
    @lilly
    pp -> ssh to providenceproject.com
  8. Download the emails.
    wget http://codelib.net/batch9.tgz
  9. Extract.
    tar -xzf batch9.tgz
  10. Send.
    cd msgs
    (sh ~/bin/send.sh &) >> log 2>&1
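
The head/tail arithmetic in step 2 can be wrapped up so that only the batch number changes from one batch to the next. A sketch, with batch_lines as a made-up name:

```shell
# batch_lines N SIZE
# Prints batch number N (counting from 1) of SIZE lines each from stdin,
# that is, lines (N-1)*SIZE+1 through N*SIZE.
batch_lines() {
    first=$(( ($1 - 1) * $2 + 1 ))
    last=$(( $1 * $2 ))
    sed -n "${first},${last}p"
}

# usage: batch_lines 9 2000 < email-input.txt > batch9/emails.txt
```

For batch nine this selects lines 16001 through 18000, the same lines as head -18000 | tail -2000. The two differ only when the input runs short: sed prints just the lines that exist in that range, while head and tail print the last 2,000 lines of whatever there is.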

16 Sep 2008

#!/usr/bin/perl
# any2lf
# converts input from Macintosh (CR) or Windows (CRLF) format
# into Unix canonical linebreaks (LF).
while (<>) { s/\r\n|\r/\n/g; print; }

I install this program on all my systems due to the frequency of receiving files from other operating systems. It simplifies processing so much--and I had done the command line version perl -i -pe 's/\r\n|\r/\n/g' so often that my fingers were very tired!

An example of when I used it just happened today. I downloaded a privately uploaded file from DropOff.us consisting of comma separated values (CSV) from FileMaker Pro. However, it was in old-fashioned style Macintosh text, so I used any2lf in order to convert it to LF and use the data to send out a list of emails.
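
To find out whether a file needs converting at all, it is enough to count the carriage returns in it. A tiny sketch; count_cr is a made-up name.

```shell
# count_cr: print the number of carriage return (CR) bytes on stdin.
# Zero means the input already uses plain LF line breaks.
count_cr() {
    tr -cd '\r' | wc -c | tr -d ' '
}

# usage: count_cr < somefile.csv
```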

18 Sep 2008

"Mrs.","Holly","Smith","holly@smith.com"

How do I send out an email advertisement (HTML version of email advertisement) from a file containing these values, formatted as above?

First, I randomize the input so that I can send out batches and have the results from each batch be an accurate random sample.

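# shuffle.pl
# A random sort comparison is inconsistent and gives a biased order,
# so use the shuffle function from List::Util instead.
use List::Util 'shuffle';
print shuffle(<STDIN>);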

I use the program by doing

perl shuffle.pl < emails.txt > emails-random.txt
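
On a machine without Perl, a shuffle can also be sketched with standard tools: give every line a random number, sort by that number, and cut the number off again. shuffle_lines is a made-up name. Note that awk's srand() seeds from the clock, so two runs within the same second may produce the same order.

```shell
# shuffle_lines: print the lines of stdin in random order.
shuffle_lines() {
    awk 'BEGIN { srand() } { printf "%.17f\t%s\n", rand(), $0 }' |
        LC_ALL=C sort -n |
        cut -f 2-
}

# usage: shuffle_lines < emails.txt > emails-random.txt
```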

Next I create a folder called "batch1" and I input the first 500 emails.

head -500 emails-random.txt > batch1/list.txt

I cd into the batch1 directory, and then I run the following Perl script with the command perl ../msgs.pl ../email.mime < list.txt

#!/usr/bin/perl -I/home/j/apache228_4-14-08/www/uspremium.us
# msgs.pl

use easy;

my $xid = 1;

system('mkdir msgs') unless -d 'msgs';

#"Mrs.","Becky","Adrian","nbadrian@insightbb.com"
my $filename = shift;
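my $file = template($filename) || die "no file found in $filename\n";
while (<>) {
    # Pull out the four quoted fields: title, first name, last name, email.
    my ($title, $name1, $name2, $email) = m/"(.*?)"/g;
    next unless defined $email;    # skip lines without four quoted fields
    my $txt = $file;
    $txt =~ s/xemail/"$name1 $name2" <$email>/;
    $txt =~ s/xdear/$title $name2/g;
    $txt =~ s/xid/$xid/g;
    # $. is the input line number, so line 505 becomes msgs/505.mime.
    outputfile("msgs/$..mime", $txt);
}

sub outputfile {
    my ($f, $s) = @_;
    open(local *FH, '>', $f) || die "Cannot write $f: $!";
    print FH $s;
    close(FH);
}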

Of course, the easy.pm Perl script/pragma/module is the extremely useful module written in 2000, almost ten years ago. It's compatible with Perl back to version 5.004, and it runs on Macintosh, even the classic version back to 7.6 under MacPerl, Windows, Linux, and Unix.

At last, I upload the newly created "msgs" folder to my server, containing 500 pre-made email messages. After extracting the tarball that I uploaded, I go inside of it, and run the following shell script sh ~/bin/send.sh.

# send.sh
sendmail=/usr/sbin/sendmail
for a in *.mime; do
    echo "$a"
    if "$sendmail" -t -oi < "$a"; then
        rm "$a"
        sleep 1
    fi
done

This has the benefit that any unsuccessful emails are left over after the other messages have been sent. Also, the program can be safely terminated at any time, and then restarted at the same point by repeating the original command sh ~/bin/send.sh.
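
The restart behavior can be tried out without sending any real mail by passing the mailer in as an argument. This is only a dry-run sketch: send_all and fake_sendmail are made-up names, fake_sendmail merely pretends that any message containing the word "bad" fails, and the sleep is left out to keep the test quick.

```shell
# fake_sendmail: hypothetical stand-in for sendmail. It reads a message
# on stdin and fails if the message contains the word "bad".
fake_sendmail() {
    ! grep -q bad
}

# send_all MAILER: the same loop as send.sh, with the mailer passed in.
# Each *.mime file in the current directory is removed only if MAILER succeeds.
send_all() {
    for a in *.mime; do
        [ -e "$a" ] || continue    # the glob matched nothing: no messages left
        if "$1" -t -oi < "$a"; then
            rm -- "$a"
        fi
    done
}

# usage, inside a directory of test messages: send_all fake_sendmail
```

After a run, the files still in the directory are exactly the ones that were not sent, so running it again retries only those.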