Thursday, July 28, 2005

Quick black box testing example

There's an ongoing debate on the agile-testing mailing list on whether it's better to have a 'black box' or a 'white box' view into the system under test. Some are of the opinion that black boxes are easier to test, while others (Ron Jeffries in particular) say that one would like to 'open up' one's boxes, especially in an agile environment. I suspect that the answer, as always, is somewhere in the middle -- both white-box and black-box testing are critical and valuable in their own right.

I think that it's in combining both types of tests that developers and testers will find the confidence that the software under test is stable and relatively free of bugs. Developers do white-box testing via unit tests, while testers do mostly black-box testing (or maybe gray-box, since they usually do have some insight into the inner workings of the application) via functional, integration and system testing. Let's not forget load/performance/stress testing. These can also be viewed as white-box (mostly in the case of performance testing) vs. black-box (load/stress testing), as I wrote in a previous post.

I want to include in this post my answer to a little example posted by Ron Jeffries. Here's what he wrote:

Let's explore a simple example. Suppose we have an application that includes an interface (method) whose purpose is to find a "matching" record in a collection, if one exists. If none exists, the method is to return null.

The collection is large. Some users of this method have partial knowledge of the collection's order, so that they know that the record they want, if it is in there at all, occurs at or after some integer index in the collection.

So the method accepts a value, let's say a string /find/, to match the record on, and an integer /hint/, to be used as a hint to start the search. The first record in the table is numbered zero. The largest meaningful /hint/ value is therefore N-1, where N is the number of records in the table.

We want the search to always find a record if one exists, so that if /hint/ is wrong, but /find/ is in some record, we must still return a matching record, not null.

Now then. Assuming a black box, what questions do we want to ask, what tests do we want to write, against our method

public record search(string find, int hint)?

And here's my answer:

I'll take a quick stab at it. Here's what I'd start by doing (emphasis on start):

1. Generate various data sets to run the 'search' method against.

1a. Vary the number of items in the collection: create collections with 0, 1, 10, 100, 1000, 10000, 100000, 1 million items for starters; it may be the case that we hit an operating system limit at some point, for example if the items are files in the same directory (ever done an ls only to get back a message like "too many arguments"?)

1b. For each collection in 1a., generate several orderings: increasing order, decreasing order, random, maybe some other statistical distributions.

1c. Vary the length of the names of the items in the collection: create collections with 0, 1, 10, 100, 1000 items, where the names of the items are generated randomly with lengths between 1 and 1000 (arbitrary limit, which may change as we progress testing).

1d. Generate item names with 'weird' characters (especially /, \, :, ; -- since they tend to be used as separators by the OS).

1e. Generate item names that are Unicode strings.
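As a rough illustration of steps 1a-1e, here is a minimal Python 3 sketch of a data set generator. The names make_collection, random_name, WEIRD_CHARS and UNICODE_SAMPLE are made up for this example, and the items are assumed to be plain strings; a real harness would generate whatever record type the application actually stores.

```python
import random
import string

WEIRD_CHARS = "/\\:;"          # the separators singled out in 1d
UNICODE_SAMPLE = "äßñ日本語"    # arbitrary non-ASCII sample for 1e (an assumption)

def random_name(rng, max_len, alphabet):
    """Return a random name of length 1..max_len drawn from alphabet."""
    length = rng.randint(1, max_len)
    return "".join(rng.choice(alphabet) for _ in range(length))

def make_collection(size, ordering="random", max_len=1000,
                    alphabet=string.ascii_letters, seed=0):
    """Build a list of `size` unique names in the requested ordering.

    Assumes the alphabet and max_len allow at least `size` distinct names;
    otherwise the loop below would never finish.
    """
    rng = random.Random(seed)      # seeded, so every run sees the same data
    names = set()
    while len(names) < size:
        names.add(random_name(rng, max_len, alphabet))
    items = sorted(names)
    if ordering == "decreasing":
        items.reverse()
    elif ordering == "random":
        rng.shuffle(items)
    elif ordering != "increasing":
        raise ValueError("unknown ordering: %r" % ordering)
    return items

if __name__ == "__main__":
    for size in (0, 1, 10, 100):
        for ordering in ("increasing", "decreasing", "random"):
            collection = make_collection(size, ordering)
            print(size, ordering, len(collection))
```

Passing alphabet=string.ascii_letters + WEIRD_CHARS or alphabet=string.ascii_letters + UNICODE_SAMPLE covers 1d and 1e with the same function. The fixed seed is a deliberate choice: a failing test can be reproduced with exactly the same collection.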

2. Run (and time) the 'search' method against the various collections generated in 1. Make sure you cover cases such as:

2a. The item we search for is not in the collection: verify that the search method returns null.

2b. The item we search for is in position p, where p can be 0, N/2, N-1, N.

2c. For each case in 2b, specify a hint of 0, p-1, p, p+1, N-1: verify that in all combinations of 2b and 2c, the search method returns the item in position p.

2d. Investigate the effect of item naming on the search. Does the search method work correctly when item names keep getting longer? When the item names contain 'weird' or Unicode characters?

2e. Graph the running time of the search method against collection size, when the item is or is not in the collection (so you generate 2 graphs). See if there is any anomaly.

2f. Run the tests in 2a-2d in a loop, to see if the search method produces a memory leak.

2g. Monitor various OS parameters (via top, vmstat, Windows PerfMon) to see how well-behaved the search functionality is in regards to the resources on that machine.

2h. See how the search method behaves when other resource-intensive processes are running on that machine (CPU-, disk-, memory-, network- intensive).
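To make 2a-2c concrete, here is a minimal Python 3 sketch. The search function below is only a hypothetical stand-in for the method under test: it takes the collection as an explicit argument, treats "matching" as string equality, and wraps around when the hint is too high. None of that comes from Ron's description, and a true black-box test would call the real method instead.

```python
def search(records, find, hint):
    """Hypothetical stand-in for the method under test.

    Starts at `hint`, wraps around the end of the collection, and
    returns the matching record, or None if there is no match.
    """
    n = len(records)
    if n == 0:
        return None
    start = hint if 0 <= hint < n else 0   # ignore out-of-range hints
    for offset in range(n):
        record = records[(start + offset) % n]
        if record == find:
            return record
    return None

def check_search(n):
    """Run the 2a-2c checks against a collection of n records (n >= 1)."""
    records = ["item-%d" % i for i in range(n)]
    # 2a: a missing item yields None whatever the hint is
    for hint in (0, n // 2, n - 1):
        assert search(records, "missing", hint) is None
    # 2b/2c: item at position p, probed with hints around p.
    # p = N is left out: with zero-based numbering no record lives there.
    for p in sorted({0, n // 2, n - 1}):
        for hint in sorted({0, p - 1, p, p + 1, n - 1}):
            assert search(records, records[p], hint) == records[p]

if __name__ == "__main__":
    for n in (1, 2, 10, 1000):
        check_search(n)
    print("all checks passed")
```

The hints p-1 and p+1 fall outside the valid range when p is the first or last position, so the same loop also exercises a negative hint and a hint of N.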

If the collection of records is kept in a database, then I can imagine a host of other stuff to test that is database-related. Same if the collection is retrieved over the network.

As I said, this is just an initial stab at testing the search method. I'm sure people can come up with many more things to test. But I think this provides a pretty solid base and a pretty good automated test suite for the AUT.

I can think of many more tests that should be run if the search application talks to a database, or if it retrieves the search results via a Web service for example. I guess this all shows that a tester's life is not easy :-) -- but this is all exciting stuff at the same time!

Sunday, July 24, 2005

Django cheat sheet

Courtesy of James: Django cheat sheet. I went through the first 2 parts of the Django tutorial and I have to say I'm very impressed. Can't wait to give it a try on a real Web application.

Friday, July 22, 2005

Slides from 'py library overview' presentation

I presented an overview of the py library last night at our SoCal Piggies meeting. Although I didn't cover all the tools in the py library, I hope I managed to heighten the interest in this very useful collection of modules. You can find the slides here. Kudos again to Holger Krekel and Armin Rigo, the main guys behind the py lib.

And while we're on this subject, let's make py.test the official unit test framework for Django!!! (see the open ticket on this topic)

Friday, July 15, 2005

Installing Python 2.4.1 and cx_Oracle on AIX

I just went through the pain of getting the cx_Oracle module to work on an AIX 5.1 server running Oracle 9i, so I thought I'd jot down what I did, for future reference.

First of all, I had ORACLE_HOME set to /oracle/OraHome1.

1. Downloaded the rpm.rte package from the AIX Toolbox Download site.
2. Installed rpm.rte via smit.
3. Downloaded (from the same AIX Toolbox Download site) and installed the following RPM packages, in this order:
rpm -hi gcc-3.3.2-5.aix5.1.ppc.rpm
rpm -hi libgcc-3.3.2-5.aix5.1.ppc.rpm
rpm -hi libstdcplusplus-3.3.2-5.aix5.1.ppc.rpm
rpm -hi libstdcplusplus-devel-3.3.2-5.aix5.1.ppc.rpm
rpm -hi gcc-cplusplus-3.3.2-5.aix5.1.ppc.rpm
4. Made a symlink from gcc to cc_r, since many configuration scripts find cc_r as the compiler of choice on AIX, and I did not have it on my server.
ln -s /usr/bin/gcc /usr/bin/cc_r
5. Downloaded Python-2.4.1 from python.org.
6. Installed Python-2.4.1 (note that the vanilla ./configure failed, so I needed to run it with --disable-ipv6):
gunzip Python-2.4.1.tgz
tar xvf Python-2.4.1.tar
cd Python-2.4.1
./configure --disable-ipv6
make
make install
7. Downloaded cx_Oracle-4.1 from sourceforge.net.
8. Installed cx_Oracle-4.1 (note that I indicated the full path to python, since there was another older python version on that AIX server):
bash-2.05a# /usr/local/bin/python setup.py install
running install
running build
running build_ext
building 'cx_Oracle' extension
creating build
creating build/temp.aix-5.1-2.4
cc_r -pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I/oracle/OraHome1/rdbms/demo -I/oracle/OraHome1/rdbms/public -I/oracle/OraHome1/network/public -I/usr/local/include/python2.4 -c cx_Oracle.c -o build/temp.aix-5.1-2.4/cx_Oracle.o -DBUILD_TIME="July 15, 2005 14:49:28"
In file included from /oracle/OraHome1/rdbms/demo/oci.h:2138,
from cx_Oracle.c:9:
/oracle/OraHome1/rdbms/demo/oci1.h:148: warning: function declaration isn't a prototype
In file included from /oracle/OraHome1/rdbms/demo/ociap.h:190,
from /oracle/OraHome1/rdbms/demo/oci.h:2163,
from cx_Oracle.c:9:
/oracle/OraHome1/rdbms/public/nzt.h:667: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2655: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2664: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2674: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2683: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2692: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2701: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2709: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2719: warning: function declaration isn't a prototype
In file included from /oracle/OraHome1/rdbms/demo/oci.h:2163,
from cx_Oracle.c:9:
/oracle/OraHome1/rdbms/demo/ociap.h:6888: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/demo/ociap.h:9790: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/demo/ociap.h:9796: warning: function declaration isn't a prototype
In file included from Variable.c:93,
from Cursor.c:211,
from Connection.c:303,
from SessionPool.c:132,
from cx_Oracle.c:73:
DateTimeVar.c: In function `DateTimeVar_SetValue':
DateTimeVar.c:81: warning: unused variable `status'
creating build/lib.aix-5.1-2.4
/usr/local/lib/python2.4/config/ld_so_aix cc_r -pthread -bI:/usr/local/lib/python2.4/config/python.exp build/temp.aix-5.1-2.4/cx_Oracle.o -L/oracle/OraHome1/lib -lclntsh -o build/lib.aix-5.1-2.4/cx_Oracle.so -s
ld: 0711-317 ERROR: Undefined symbol: .OCINumberFromInt
ld: 0711-317 ERROR: Undefined symbol: .OCINumberFromReal
ld: 0711-317 ERROR: Undefined symbol: .OCINumberFromText
ld: 0711-317 ERROR: Undefined symbol: .OCINumberToReal
ld: 0711-317 ERROR: Undefined symbol: .OCINumberToText
ld: 0711-317 ERROR: Undefined symbol: .OCINumberToInt
ld: 0711-317 ERROR: Undefined symbol: .OCIParamGet
ld: 0711-317 ERROR: Undefined symbol: .OCIDescriptorFree
ld: 0711-317 ERROR: Undefined symbol: .OCIAttrGet
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtExecute
ld: 0711-317 ERROR: Undefined symbol: .OCISessionGet
ld: 0711-317 ERROR: Undefined symbol: .OCIServerDetach
ld: 0711-317 ERROR: Undefined symbol: .OCITransRollback
ld: 0711-317 ERROR: Undefined symbol: .OCISessionEnd
ld: 0711-317 ERROR: Undefined symbol: .OCISessionRelease
ld: 0711-317 ERROR: Undefined symbol: .OCIHandleFree
ld: 0711-317 ERROR: Undefined symbol: .OCIHandleAlloc
ld: 0711-317 ERROR: Undefined symbol: .OCIAttrSet
ld: 0711-317 ERROR: Undefined symbol: .OCITransStart
ld: 0711-317 ERROR: Undefined symbol: .OCISessionPoolCreate
ld: 0711-317 ERROR: Undefined symbol: .OCIErrorGet
ld: 0711-317 ERROR: Undefined symbol: .OCIEnvCreate
ld: 0711-317 ERROR: Undefined symbol: .OCINlsNumericInfoGet
ld: 0711-317 ERROR: Undefined symbol: .OCISessionPoolDestroy
ld: 0711-317 ERROR: Undefined symbol: .OCITransCommit
ld: 0711-317 ERROR: Undefined symbol: .OCITransPrepare
ld: 0711-317 ERROR: Undefined symbol: .OCIBreak
ld: 0711-317 ERROR: Undefined symbol: .OCIUserCallbackRegister
ld: 0711-317 ERROR: Undefined symbol: .OCIUserCallbackGet
ld: 0711-317 ERROR: Undefined symbol: .OCIServerAttach
ld: 0711-317 ERROR: Undefined symbol: .OCISessionBegin
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtRelease
ld: 0711-317 ERROR: Undefined symbol: .OCIDescriptorAlloc
ld: 0711-317 ERROR: Undefined symbol: .OCIDateTimeConstruct
ld: 0711-317 ERROR: Undefined symbol: .OCIDateTimeCheck
ld: 0711-317 ERROR: Undefined symbol: .OCIDateTimeGetDate
ld: 0711-317 ERROR: Undefined symbol: .OCIDateTimeGetTime
ld: 0711-317 ERROR: Undefined symbol: .OCILobGetLength
ld: 0711-317 ERROR: Undefined symbol: .OCILobWrite
ld: 0711-317 ERROR: Undefined symbol: .OCILobTrim
ld: 0711-317 ERROR: Undefined symbol: .OCILobRead
ld: 0711-317 ERROR: Undefined symbol: .OCILobFreeTemporary
ld: 0711-317 ERROR: Undefined symbol: .OCILobCreateTemporary
ld: 0711-317 ERROR: Undefined symbol: .OCIDefineByPos
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtGetBindInfo
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtPrepare2
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtFetch
ld: 0711-317 ERROR: Undefined symbol: .OCIBindByName
ld: 0711-317 ERROR: Undefined symbol: .OCIBindByPos
ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.
collect2: ld returned 8 exit status
running install_lib
At this point, I did a lot of Google searches to find out why the loader emits these errors. I finally found the solution: Oracle 9i installs the 64-bit libraries in $ORACLE_HOME/lib and the 32-bit libraries in $ORACLE_HOME/lib32. Since setup.py is looking by default in $ORACLE_HOME/lib (via -L/oracle/OraHome1/lib), it finds the 64-bit libraries and it fails with the above errors. The quick hack I found was to manually re-run the last command that failed and specify -L/oracle/OraHome1/lib32 instead of -L/oracle/OraHome1/lib (I think the same effect can be achieved via environment variables such as LIBPATH).
bash-2.05a# /usr/local/lib/python2.4/config/ld_so_aix cc_r -pthread -bI:/usr/local/lib/python2.4/config/python.exp build/temp.aix-5.1-2.4/cx_Oracle.o -L/oracle/OraHome1/lib32 -lclntsh -o build/lib.aix-5.1-2.4/cx_Oracle.so -s
Then I re-ran setup.py in order to copy the shared library to the Python site-packages directory:

bash-2.05a# /usr/local/bin/python setup.py install
running install
running build
running build_ext
running install_lib
copying build/lib.aix-5.1-2.4/cx_Oracle.so -> /usr/local/lib/python2.4/site-packages


At this point I was able to import cx_Oracle at the Python prompt:

bash-2.05a# /usr/local/bin/python
Python 2.4.1 (#1, Jul 15 2005, 14:44:07)
[GCC 3.3.2] on aix5
Type "help", "copyright", "credits" or "license" for more information.
>>> import cx_Oracle
>>> dir(cx_Oracle)
['BINARY', 'BLOB', 'CLOB', 'CURSOR', 'Connection', 'Cursor', 'DATETIME', 'DataError', 'DatabaseError', 'Date', 'DateFromTicks', 'Error', 'FIXED_CHAR', 'FNCODE_BINDBYNAME', 'FNCODE_BINDBYPOS', 'FNCODE_DEFINEBYPOS', 'FNCODE_STMTEXECUTE', 'FNCODE_STMTFETCH', 'FNCODE_STMTPREPARE', 'IntegrityError', 'InterfaceError', 'InternalError', 'LOB', 'LONG_BINARY', 'LONG_STRING', 'NUMBER', 'NotSupportedError', 'OperationalError', 'ProgrammingError', 'ROWID', 'STRING', 'SYSDBA', 'SYSOPER', 'SessionPool', 'TIMESTAMP', 'Time', 'TimeFromTicks', 'Timestamp', 'TimestampFromTicks', 'UCBTYPE_ENTRY', 'UCBTYPE_EXIT', 'UCBTYPE_REPLACE', 'Warning', '__doc__', '__file__', '__name__', 'apilevel', 'buildtime', 'connect', 'makedsn', 'paramstyle', 'threadsafety', 'version']

Thursday, July 14, 2005

py lib gems: greenlets and py.xml

I've been experimenting with various tools in the py library lately, in preparation for a presentation I'll give to the SoCal Piggies group meeting this month. The py lib is chock-full of gems that are waiting to be discovered. In this post, I'll talk a little about greenlets, the creation of Armin Rigo. I'll also briefly mention py.xml.

Greenlets implement coroutines in Python. Coroutines can be seen as a generalization of generators, and it looks like the standard Python library will support them in the future via 'enhanced generators' (see PEP 342). Coroutines allow you to exit a function by 'yielding' a value and switching to another function. The original function can then be re-entered, and it will continue execution from exactly where it left off.
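The "exit by yielding, re-enter where you left off" behavior can be illustrated without greenlets. Here is a minimal sketch using the generator send() method that PEP 342 describes (it shipped in Python 2.5, and the code below is written for Python 3). The running_average function is made up for this example.

```python
def running_average():
    """Coroutine: receives numbers via send() and yields the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # Execution pauses here and resumes when the caller sends a value.
        value = yield average
        total += value
        count += 1
        average = total / count

if __name__ == "__main__":
    avg = running_average()
    next(avg)              # run up to the first yield
    print(avg.send(10))    # 10.0
    print(avg.send(20))    # 15.0
```

Note that total and count survive between calls: the function's local state is kept while control is with the caller, which is exactly the property that greenlets generalize to arbitrary call stacks.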

The greenlet documentation offers some really eye-opening examples of how they can be used to implement generators for example. Another typical use case for greenlets/coroutines is turning asynchronous or event-based code into normal sequential control flow code -- the Python Desktop Server project has a good example of exactly such a transformation.

I've also been reading and looking at the code from Armin's EuroPython talk on greenlets. The talk itself must have been highly entertaining, since it is presented as a PyGame-based game. In one of the code examples I downloaded, I noticed yet another application of the asynchronous-to-sequential transformation, this time related to parsing XML data. In a few lines of code, Armin showed how to turn an asynchronous, Expat-based parsing mechanism into a generator that yields the XML elements one at a time. This approach combines the advantages of 1) using a stream oriented parser (and thus being able to process large amounts of XML data via handlers) with 2) using a generator to expose the XML parsing code in the shape of an iterator.

Here is Armin's code which I saved in a module called iterxml.py (I made a few minor modifications to make the code more general-purpose):

from py.magic import greenlet
import xml.parsers.expat

def send(arg):
    greenlet.getcurrent().parent.switch(arg)

# 3 handler functions
def start_element(name, attrs):
    send(('START', name, attrs))
def end_element(name):
    send(('END', name))
def char_data(data):
    data = data.strip()
    if data:
        send(('DATA', data))

def greenparse(xmldata):
    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = start_element
    p.EndElementHandler = end_element
    p.CharacterDataHandler = char_data
    p.Parse(xmldata, 1)

def iterxml(xmldata):
    g = greenlet(greenparse)
    data = g.switch(xmldata)
    while data is not None:
        yield data
        data = g.switch()

Consumers of this code can pass a string containing an XML document to the iterxml function and then use a for loop to iterate through the elements yielded by the function, like this:

for data in iterxml(xmldata):
    print data

When iterxml first executes, it instantiates a greenlet object and associates it with the greenparse function. Then it 'switches' into the greenlet, which calls that function with the given xmldata argument. There is nothing out of the ordinary in the greenparse function, which simply assigns the 3 handler functions to the Expat parser object, then calls its Parse method. However, the 3 handler functions all use greenlets via the send function, which sends the parsed data to the parent of the current greenlet. The parent in this case is the iterxml function, which yields the data at that point, then switches back into the greenparse function. The handler functions then get called again whenever a new XML element is encountered, and the switching back and forth continues until there is no more data to be parsed.
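For comparison, here is a sketch of a similar iterator built with the standard library only (Python 3). It is not Armin's technique: instead of switching greenlets inside the handlers, it feeds Expat one chunk at a time and yields whatever events the handlers buffered for that chunk. The name iterxml_chunked and the chunk_size parameter are made up for this example.

```python
import xml.parsers.expat

def iterxml_chunked(xmldata, chunk_size=1024):
    """Yield ('START', name, attrs), ('END', name) and ('DATA', text) tuples."""
    events = []   # filled by the handlers, drained by the loop below

    def start_element(name, attrs):
        events.append(("START", name, attrs))

    def end_element(name):
        events.append(("END", name))

    def char_data(data):
        data = data.strip()
        if data:
            events.append(("DATA", data))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data

    for start in range(0, len(xmldata), chunk_size):
        chunk = xmldata[start:start + chunk_size]
        is_final = start + chunk_size >= len(xmldata)
        parser.Parse(chunk, is_final)
        # Hand over the events produced by this chunk before parsing more.
        while events:
            yield events.pop(0)

if __name__ == "__main__":
    for event in iterxml_chunked('<rsp stat="ok"><event name="x">hi</event></rsp>'):
        print(event)
```

The trade-off is that events are only delivered at chunk boundaries, and text that straddles two chunks may arrive as separate DATA events. The greenlet version hands over each event the moment its handler fires, which is what makes it attractive.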

I've wanted for a while to check out the REST API offered by upcoming.org (which is a free alternative to meetup.com), so I used it in conjunction with the XML parsing stuff via greenlets.

Here's some code that uses the iterxml module to parse the response returned by upcoming.org when a request for searching the events in the L.A. metro area is sent to their server:

import sys, urllib
from iterxml import iterxml
import py

baseurl = "http://www.upcoming.org/services/rest/"
api_key = "YOUR_API_KEY_HERE"
metro_id = "1" # L.A. metro area
log = py.log.Producer("")

def get_venue_info(venue_id):
    method = "venue.getInfo"
    # the outer parentheses let the expression span two lines
    request = ("%s?api_key=%s&method=%s&venue_id=%s" %
               (baseurl, api_key, method, venue_id))
    response = urllib.urlopen(request).read()
    venue_info = "unknown venue" # fallback if the response has no venue element
    for data in iterxml(response):
        if data[0] == 'START' and data[1] == 'venue':
            attr = data[2]
            venue_info = "%(name)s in %(city)s" % attr
            break
    return venue_info

def search_events(keywords):
    method = "event.search"
    request = ("%s?api_key=%s&method=%s&metro_id=%s&search_text=%s" %
               (baseurl, api_key, method, metro_id, keywords))
    response = urllib.urlopen(request).read()
    for data in iterxml(response):
        if data[0] == 'START' and data[1] == 'event':
            attr = data[2]
            log("\n" + "-" * 80)
            log.EVENT("%(name)s" % attr)
            log.WHAT("%(description)s" % attr)
            log.WHERE(get_venue_info(attr['venue_id']))
            log.WHEN("%(start_date)s @ %(start_time)s" % attr)

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print "Usage: %s " % sys.argv[0]
        sys.exit(1)

    # URL-encode the keywords; the spaces between them become %20
    keywords = urllib.quote(" ".join(sys.argv[1:]))
    search_events(keywords)
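The request strings in the script are assembled by hand. As a sketch of an alternative (Python 3, where urllib's helpers live in urllib.parse), the query string can be built from a dictionary so that every value is escaped the same way. The helper name build_request is made up for this example, and I have not checked which escaping the upcoming.org server accepts beyond the %20 form used above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://www.upcoming.org/services/rest/"

def build_request(method, api_key, **params):
    """Return the request URL for an API method and its parameters."""
    query = {"api_key": api_key, "method": method}
    query.update(params)
    # quote_via=quote encodes spaces as %20 rather than '+'
    return BASE_URL + "?" + urlencode(query, quote_via=quote)

if __name__ == "__main__":
    print(build_request("event.search", "YOUR_API_KEY_HERE",
                        metro_id="1", search_text="python testing"))
```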

An upcoming.org API key is automatically generated for you when you click here.

I tested the script by searching for Python-related events in L.A.:

./upcoming_search.py python

--------------------------------------------------------------------------------
[EVENT] SoCal Piggies July Meeting
[WHAT] Monthly meeting of the Southern California Python Interest Group.
[WHERE] USC in Los Angeles
[WHEN] 2005-07-26 @ 19:00:00

Note that I'm also using the py.log facilities I mentioned in a previous post. The only thing I needed to do was to instantiate a log object via log = py.log.Producer("") and then use it via keywords such as EVENT, WHAT, WHERE and WHEN. Since I didn't declare any log consumer, the default consumer is used, which prints its messages to stdout. Each message string is nicely prefixed by the corresponding keyword.
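To show the idea behind keyword-based logging, here is a small Python 3 sketch of how attribute access can be turned into message prefixes. It is not py.log's implementation. The class name KeywordProducer and its out parameter are made up for this example, and the colon-joined prefix for nested keywords is my own choice.

```python
import sys

class KeywordProducer:
    """Callable logger: any attribute name becomes a keyword prefix."""

    def __init__(self, keywords=(), out=None):
        self._keywords = tuple(keywords)
        self._out = out

    def __getattr__(self, name):
        # Only reached for names not found normally, e.g. log.EVENT
        if name.startswith("_"):
            raise AttributeError(name)
        return KeywordProducer(self._keywords + (name,), self._out)

    def __call__(self, message):
        out = self._out if self._out is not None else sys.stdout
        prefix = "[%s] " % ":".join(self._keywords) if self._keywords else ""
        out.write(prefix + message + "\n")

if __name__ == "__main__":
    log = KeywordProducer()
    log.EVENT("SoCal Piggies July Meeting")
    log.WHEN("2005-07-26 @ 19:00:00")
```

Because keywords are created on demand, the calling code never has to declare EVENT or WHEN up front, which is what keeps the logging calls in the script so short.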

I'm still experimenting with greenlets, and I'm sure I'll use them in the future especially for event-based GUI code.

I want to also briefly touch on py.xml, a tool that allows you to generate XML and HTML documents almost painlessly from your Python code.

Here's the XML returned by the event.search method of upcoming.org when called with a search text of 'python':

<rsp stat="ok" version="1.0">
<event id="24868" name="SoCal Piggies July Meeting" description="Monthly meeting of the Southern California Python Interest Group." start_date="2005-07-26" end_date="0000-00-00" start_time="19:00:00" end_time="21:00:00" personal="0" selfpromotion="0" metro_id="1" venue_id="7425" user_id="14959" category_id="4" date_posted="2005-07-13" />
</rsp>


And here's how to generate the same XML output with py.xml:

import py

class ns(py.xml.Namespace):
    "my custom xml namespace"

doc = ns.rsp(
    ns.event(
        id="24868",
        name="SoCal Piggies July Meeting",
        description="Monthly meeting of the Southern California Python Interest Group.",
        start_date="2005-07-26",
        end_date="0000-00-00",
        start_time="19:00:00",
        end_time="21:00:00",
        personal="0",
        selfpromotion="0",
        metro_id="1",
        venue_id="7425",
        user_id="14959",
        category_id="4",
        date_posted="2005-07-13"),
    stat="OK",
    version="1.0",
    )

print doc.unicode(indent=2).encode('utf8')


The code above prints:

<rsp stat="OK" version="1.0">
<event category_id="4" date_posted="2005-07-13" description="Monthly meeting of the Southern California Python Interest Group." end_date="0000-00-00" end_time="21:00:00" id="24868" metro_id="1" name="SoCal Piggies July Meeting" personal="0" selfpromotion="0" start_date="2005-07-26" start_time="19:00:00" user_id="14959" venue_id="7425"/></rsp>


As the py.xml documentation succinctly puts it, positional arguments are child-tags and keyword-arguments are attributes. Indentation is available via the indent argument to the unicode method.

I intend to cover other tools from the py library in future posts. Stay tuned for discussions on py.execnet, little-known aspects of py.test and more!

Friday, July 01, 2005

Recommended reading: Jason Huggins's blog

I recently stumbled on Jason's blog via the Thoughtworks RSS feed aggregator. Jason is the creator of Selenium and a true Pythonista. His latest post on using CherryPy, SQLObject and Cheetah for creating a 'Ruby on Rails'-like application is very interesting and entertaining. Highly recommended! Hopefully the Subway guys will heed Jason's advice of focusing more on "ease of installation and fancy earth-shatteringly beautiful 10 minute setup movies" -- this is one area in which it's hard to beat the RoR guys, but let's at least try it!
