Code Read

Friday, November 18, 2011

Music Notes Flashcards

This was a small project I did to help me teach my brother-in-law to play the piano. View source for the code. The possibly interesting parts: indexing an anonymous associative array, generating a random integer, string concatenation with integers, and halfway-done code factoring.

Monday, August 1, 2011

Bookmarklet: Check if a website is down

Drag this link, "down?", to your bookmarks toolbar for a button that checks downforeveryoneorjustme.com to see if the website you just failed to load is actually down, or if you just have a problem getting to it.

Here's the code:

(function(){
  document.location = 'http://downforeveryoneorjustme.com/'
    + encodeURIComponent(location.host);
})()

Saturday, July 30, 2011

Bookmarklet: Scroll to the last new Tweet on Twitter

Drag this link, "Last New Tweet", to your bookmarks toolbar for a bookmarklet that will automatically scroll you to the last new tweet in your Twitter feed.

This would be more useful as a Greasemonkey script, but bookmarklets work for more browsers. Here's the code behind it:

(function() {
  var t = $('.last-new-tweet');
  scroll(0, t.offset().top + t.height() - $(window).height());
})();

There are some interesting things in these few lines of code that are worth pointing out. That $() business is jQuery, a really great JavaScript library for writing compact, powerful, cross-browser JavaScript. I can use it in this bookmarklet because Twitter already uses it, so the library is already loaded. To (perhaps dangerously) oversimplify, $() lets you "select" page elements by class ($('.last-new-tweet')), reference ($(window)), id, and others, giving you a slew of useful functions (offset(), height(), etc) to query and manipulate them.

The other fascinating thing here is the immediate calling of an anonymous function. The basic syntax of this construct is:

(function (){statement;})()

In essence, a function is created, called, and discarded, all in one statement. This is useful when you specifically do not want a return value, or when you need to fit multiple statements into one statement.

Monday, July 18, 2011

Google site search bookmarklet

Drag this link, "Search this site", to your bookmarks toolbar for an instant site search button. Here's the code, prettified:

function f() {
  var q=prompt('Search for');
  if (q!=null) {
    location.href='http://google.com/search?q=site:'
      + location.host + '+' + encodeURIComponent(q);
  }
}
f();

Saturday, June 18, 2011

Nightly Nmap scans with Ndiff

I was recently playing around with running Nmap scans from a cron job, and I thought I could do it better than I was. Here's what I was doing before:

#m h dom mon dow command
0  3 *   *   *   /usr/local/bin/nmap -v --open -oA /root/nmap/lan-\%y\%m\%d 192.168.1.0/24

So this would run every night at 3am, performing a verbose TCP SYN scan of my network, showing only open ports, and creating output files in all 3 formats (Normal, XML, and Greppable) in the /root/nmap/ directory. Just getting this far presented some challenges, since I was unfamiliar with some aspects of the crontab file format:

Commands run by cron have their environment stripped down for security reasons. Specifically, the PATH variable is set to /usr/bin:/bin, which is pretty restrictive, and it does not include /usr/local/bin, where my nmap lives. Since cron just logs that it ran the command, and not the output of the command, I was very confused as to why the logs showed it being run, but no output was generated. Using the absolute path to nmap fixed it.
Percent signs (%) are interpreted as newlines by cron. Anything after the first line is passed to the command on STDIN, similar to a here-doc in shell programming. To pass the time format specifiers to Nmap, I needed to escape the percent signs with backslashes.
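To make the percent-sign rule concrete, here is a minimal Python sketch of how cron treats the command field. The function name split_cron_command is made up for illustration, and this is a simplified model of the documented behavior, not cron's actual code.

```python
def split_cron_command(field):
    """Split a crontab command field into (command, stdin_text).

    Simplified model: an unescaped % ends the command, later unescaped
    % signs become newlines in the stdin text, and \\% is a literal %.
    """
    parts = [[]]  # parts[0] is the command, the rest are stdin lines
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field) and field[i + 1] == "%":
            parts[-1].append("%")  # escaped percent stays literal
            i += 2
            continue
        if ch == "%":
            parts.append([])  # unescaped percent starts a new line
        else:
            parts[-1].append(ch)
        i += 1
    command = "".join(parts[0])
    stdin_text = "\n".join("".join(p) for p in parts[1:])
    return command, stdin_text


# With the backslashes, the whole string reaches the shell as the command.
print(split_cron_command(r"nmap -oA lan-\%y\%m\%d 192.168.1.0/24"))
# Without them, the command is cut off at the first percent sign.
print(split_cron_command("nmap -oA lan-%y%m%d 192.168.1.0/24"))
```

In the unescaped case the command that actually runs ends at "lan-", which is why the scan never produced the files I expected.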

So this was pretty good, but it left a lot to be desired. To get an idea of what had changed, I needed to manually run an Ndiff on the last two scans. Also, I wasn't taking advantage of Nmap's advanced version detection capabilities. So I decided to automate the diffing process and do a follow-up in-depth scan of new services I detected.

To schedule a complicated job like this, I needed to move the logic out of the crontab and into a shell script. I broke the task down into 3 basic steps:

Scan the network
Perform a diff
Scan new stuff for version information

In order to make it worthwhile to scan things twice, I wanted my first scan to be fast. I decided early to ignore UDP ports, since scanning firewalled hosts for UDP can take hours. I also decided to use a more aggressive timing template. Nmap runs at T3 by default, but since all of my targets are just one hop away, I can easily bump that up to T4. I don't consider T5 to be worth the possible loss in accuracy, but for such a small network, it could have been useful. Finally, since I will only be looking at differences, I don't need all the extra output files, just the XML. Here's the command to do all that:

nmap -v --open -T4 -oX lan-%y%m%d 192.168.1.0/24

Next, I needed to do a diff. Nmap ships with a great tool called Ndiff, which is written in Python. It takes two Nmap XML files and generates a text or XML diff. This was a tricky decision: I wanted to be able to review the diff every morning, so text output would be best for that. But I also wanted to have my script scan all the new hosts and services, which meant parsing the output. Luckily, I have done some development work on Ndiff, so I knew that it would have the whole diff in a data structure before printing it. I just needed to run through it and pull out the new stuff.

Ndiff, like any well-written Python program, consists of a bunch of class and function definitions, and a conditional statement to run the main function if the program is run as a program, not imported as a module. This ensures there are no side-effects if it IS imported, which I planned on doing. I started by making a symlink to the ndiff program in my working directory:

ln -s /usr/local/bin/ndiff ndiff.py

I tried using the PYTHONPATH environment variable set to /usr/local/bin, but Ndiff is not installed with a .py extension, so the interpreter complained that it couldn't find the ndiff module. The symlink ends up being the way to go here.
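For what it's worth, newer Python versions can import a file that has no .py extension without a symlink. The sketch below assumes Python 3.5 or later (newer than the Python 2 code in this post), and load_script_as_module is a hypothetical helper name, not part of Ndiff. It is an alternative to the symlink, not what I actually did.

```python
import importlib.util
from importlib.machinery import SourceFileLoader


def load_script_as_module(name, path):
    """Import the Python source file at `path` under the module name `name`.

    Works even when the file has no .py extension. Because the module is
    loaded under `name`, its `if __name__ == "__main__":` block does not run.
    """
    loader = SourceFileLoader(name, path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


# Hypothetical usage; this only works if the installed ndiff script is
# compatible with the Python version running this code:
# ndiff = load_script_as_module("ndiff", "/usr/local/bin/ndiff")
```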

Next, I fired up vim and began my program, ndiffdetails.py.

#!/usr/bin/env python

from ndiff import *

def main():
  pass

if __name__ == "__main__":
  main()

Not a lot of functionality yet. I wanted a similar invocation to the ndiff program itself, so I started by copying the main function from ndiff and stripping out the options I didn't need: help, text, and xml.

def main():
    global verbose
    diffout = "diff.xml"
    cmdout = "nmap-details.sh"

    try:
        opts, input_filenames = getopt.gnu_getopt(sys.argv[1:],
            "hv", ["verbose", "diffout=", "cmdout="])
    except getopt.GetoptError, e:
        usage_error(e.msg)
    for o, a in opts:
        if o == "--diffout":
            diffout = a
        elif o == "--cmdout":
            cmdout = a
        elif o == "-v" or o == "--verbose":
            verbose = True

    if len(input_filenames) != 2:
        usage_error(u"need exactly two input filenames.")

    filename_a = input_filenames[0]
    filename_b = input_filenames[1]

    try:
        scan_a = Scan()
        scan_a.load_from_file(filename_a)
        scan_b = Scan()
        scan_b.load_from_file(filename_b)
    except IOError, e:
        print >> sys.stderr, u"Can't open file: %s" % str(e)
        sys.exit(EXIT_ERROR)

    diff = ScanDiff(scan_a, scan_b)

So at this point, the main function doesn't produce any output. It just creates a ScanDiff object from the two scans. The original ndiff.main function just prints out the text or XML representation of that object, but I wanted more. I wanted a list of new hosts and ports, so that I could generate a shell script to do the details scan. Here's what I wanted the shell script to look like:

OUTFILE=nmap-details
test -n "$1" && OUTFILE=$1
nmap -v -p $PORTS -sV -sC -oA $OUTFILE $TARGETS

The first two lines set up a default output filename but let me pass a different one as the first argument ($1). I debated using the -A or -O flags (which would both add Operating System fingerprinting), but since I'm only scanning ports that I know are open, OS fingerprinting wouldn't be as accurate. Nmap needs both open and closed ports to get a complete fingerprint.

Back in ndiffdetails.py, I needed to build a list of targets and ports. Targets would just be a subset of the hosts in the latest scan's results, which do not include duplicates, so I can use a list to hold them. Ports, on the other hand, could show up on multiple targets. I only want to specify each port once, though, so I stored them as keys to a dictionary, which guarantees no duplicates.

    targets = []
    ports = {}

    if diff.cost > 0:
        for host,h_diff in diff.host_diffs.iteritems():
            if h_diff.cost > 0 and h_diff.host_b.state == "up":
                scan_host = False
                for port,p_diff in h_diff.port_diffs.iteritems():
                    if (p_diff.port_a.state != p_diff.port_b.state and
                        p_diff.port_b.state is not None and
                        p_diff.port_b.state.startswith("open")):
                            scan_host = True
                            ports[p_diff.port_b.spec[0]]=1
                if scan_host:
                    targets.append(h_diff.host_b.get_id())

Here's what's happening: ScanDiff and HostDiff objects have a property called cost that tells how many changes it would take to change one object (scan or host) into another. If it's greater than zero, then there is a difference, and I want to scan it, but only if the host is still up in the latest scan, and only if the host has new open ports.
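The same selection logic can be sketched without Ndiff at all. The version below is a simplified model in Python 3: it assumes each scan is a plain dictionary mapping a host address to a dictionary of port number to state, which is not Ndiff's real data structure, and new_open_ports is a name invented for this example. It uses a set for the ports instead of dictionary keys.

```python
def new_open_ports(old_scan, new_scan):
    """Return (targets, ports) for ports that are open now but were not before.

    Each scan is assumed to look like {"192.168.1.10": {22: "open"}}.
    """
    targets = []
    ports = set()  # a set drops duplicate port numbers automatically
    for host, new_ports in sorted(new_scan.items()):
        old_ports = old_scan.get(host, {})
        newly_open = [
            port
            for port, state in new_ports.items()
            if state.startswith("open") and old_ports.get(port) != state
        ]
        if newly_open:  # only rescan hosts that gained an open port
            targets.append(host)
            ports.update(newly_open)
    return targets, sorted(ports)


yesterday = {"192.168.1.10": {22: "open"}}
today = {
    "192.168.1.10": {22: "open", 80: "open"},
    "192.168.1.20": {22: "open", 443: "open|filtered"},
}
print(new_open_ports(yesterday, today))
```

A host that disappears from the latest scan never appears in the loop, which mirrors the "still up" check above.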

Nearly done with ndiffdetails.py! I just needed to write my two output files: the text-format diff, and the shell script for running the followup scan.

        difffile = open(diffout, 'w')
        diff.print_text(f=difffile)
        difffile.close()

        cmdfile = open(cmdout, 'w')
        cmdfile.write("OUTFILE=nmap-details\n")
        cmdfile.write('test -n "$1" && OUTFILE=$1\n')
        cmdfile.write("/usr/local/bin/nmap -v --open -p %s -sV -oA $OUTFILE %s\n"
                % ( ",".join(map(str, ports.keys())),
                    " ".join(targets)))
        cmdfile.close()

Writing the diff out is straightforward, since that's the original purpose of ndiff. The shell script was also fairly easy, once I remembered to use the absolute path to nmap. The one complexity was getting a comma-separated list of ports. My first attempt called join directly on the dictionary keys, but here's how that went:

>>> ports = {80:1,443:1}
>>> ",".join(ports.keys())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected string, int found

The join method needs a sequence of strings, not integers. Using map, I just converted each of the keys to a string, then joined those. I also considered using reduce, like this:

reduce(lambda x,y: str(x)+","+str(y), ports.keys())

I decided that was too complicated, and probably less efficient, due to all the string concatenations and extra calls to str().
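Another option, sketched here in Python 3, is a generator expression inside join. The helper name format_port_list is made up for this example, and sorting the ports is my own addition so the output order is predictable.

```python
def format_port_list(ports):
    """Build a comma-separated port string from any iterable of integers."""
    # str() is applied to each port before joining, so join only sees strings.
    return ",".join(str(port) for port in sorted(ports))


print(format_port_list({80: 1, 443: 1}))  # iterating a dict yields its keys
```

This does the same conversion as the map version, just spelled differently.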

So finally, ndiffdetails.py was complete. My last step was to put it all together into a shell script to be called from cron. Here's how that turned out:

#!/bin/sh

NMAP=/usr/local/bin/nmap
NDIFF=/usr/local/bin/ndiff
NMAPOUT=lan-cron-$(date +%F)
NMAPLAST=last-nmap-scan

cd /root/nmap

#Fast port scan
$NMAP -v --open -T4 -oX $NMAPOUT.xml 192.168.1.0/24

#Do diff, generate details-scan command
python ndiffdetails.py --diffout $NMAPOUT.diff --cmdout nmap-details.sh \
    $NMAPLAST $NMAPOUT.xml

#run details scan
sh nmap-details.sh $NMAPOUT-details

#re-point symlink
rm $NMAPLAST
ln -s $NMAPOUT.xml $NMAPLAST

And it works like a charm!

Saturday, June 4, 2011

Trip gas cost calculator

To keep this programming-related, view page source to see how it works.

Distance (miles):

Fuel efficiency (miles per gallon):

Price of gas (dollars per gallon):

Cost of trip:
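The arithmetic behind a calculator like this is short enough to sketch in Python. This is an assumed formula (gallons used times price per gallon) and trip_cost is an illustrative name, not a copy of the page's script.

```python
def trip_cost(distance_miles, miles_per_gallon, price_per_gallon):
    """Estimate fuel cost: gallons needed multiplied by the price per gallon."""
    if miles_per_gallon <= 0:
        raise ValueError("fuel efficiency must be positive")
    gallons = distance_miles / miles_per_gallon
    return gallons * price_per_gallon


# A 300 mile trip at 30 miles per gallon and $3.50 per gallon.
print("$%.2f" % trip_cost(300, 30, 3.50))
```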

Saturday, September 4, 2010

Idioms - Using Colorful Language

Pie in the sky. In a New York minute. On the other hand. Costs an arm and a leg. In the black. Mad skills. All Greek to you? To a non-native English speaker, common idioms like these are often challenging, since their meaning is only loosely tied to the words that are used. In the same way, most programming languages have idioms that can look confusing to someone first learning the language, but which are used to perform common tasks.

Perl has many idioms. Here is a common one for "slurping" a file, or reading the entire file into a single variable (rather than the default of reading one line per "readline" call):

{ local $/; $contents = <$filehandle> }

This code introduces a new scope block with curly braces, and declares the Input Record Separator, $/, local to that block, which makes its value undef instead of a newline. Then the <> operator is used to read a "line" of text from the filehandle, which turns out to be the entire contents of the file (from the current position in the file, of course). The diamond operator itself is rather like an idiom, being a slightly magical way of saying readline $filehandle. The closing curly brace ends the scope block and returns $/ to its previous value.
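For comparison, here is the same line-versus-slurp distinction in Python 3. It is only an analogy to the Perl idiom, and it uses an in-memory io.StringIO object in place of a real file so the example is self-contained.

```python
import io

# An in-memory stand-in for an open file with three lines.
handle = io.StringIO("first line\nsecond line\nthird line\n")

first = handle.readline()  # reads a single line, like Perl's default <>
rest = handle.read()       # slurps everything from the current position on

print(repr(first))
print(repr(rest))
```

Because one line was already consumed, the slurp returns only the remaining two lines, which is the "current position" caveat mentioned above.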

Here's an idiom from Python that is often used in modules:

if __name__ == "__main__":
    main()

The idea here is to define a special behavior for when the module is used as a script, rather than being imported. When the file is imported, __name__ will be set to the name of the module, and this block will not run. When the file is used like python filename.py, however, the condition will be true, and the main function will be called. This is a convenient way to make a dual-purpose program that can be included as a module or run on its own. It could also be used as a place to put module tests.

C has lots of idioms to choose from, but here's the most recent one I came across:

struct myStruct {
    int num;
    char array[1];
};

struct myStruct *item;
size_t length = 10;
size_t i;
/* one element already lives inside the struct, so add room for length - 1 more */
item = (struct myStruct *)
    malloc( sizeof(struct myStruct) + sizeof(char) * (length - 1) );
item->num = 3;
for (i=0; i < length; ++i) {
    item->array[i] = item->num + i;
}

This trick (often called the "struct hack") is not strictly guaranteed by the C standard, since it indexes past the declared one-element bound, but it definitely works with Microsoft Visual Studio 2005. Essentially, declaring a struct with a one-element array at the end lets you allocate a struct with a variable-length array element. The trick is to never declare an instance of the struct, but instead use pointers and allocate dynamic memory. Since this idiom is fairly common, the C99 standard defined a way to declare flexible array members by leaving out the array length, like so:

struct c99struct {
    int num;
    char array[];
};

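This form is guaranteed to work in C99-compliant compilers (a set that does not include Visual Studio 2005). Note that a flexible array member contributes nothing to sizeof(struct c99struct), so the allocation needs room for all length elements rather than length - 1.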

Just a few idioms to get you started. I find the best way of learning new idioms in a programming language is reading other people's code and looking for parts I don't understand.