This article describes a conceptual framework for designing evaluation studies of test kits for the analysis of significant drinking water constituents. A commercial test kit for the analysis of lead in tap waters was evaluated and compared with a standard graphite furnace atomic absorption spectrophotometry (GFAAS) technique. The kit was relatively free of operator bias and had a detection limit of 4 µg/L in spiked deionized water. Above the detection limit of the kit, accuracy was comparable to that of GFAAS when samples were analyzed in triplicate. The relative precision of the kit varied with concentration. The absolute precision of the kit was about ±3 µg/L from 10 to 100 µg/L. Significant interferences were found for certain concentration thresholds of zinc, iron (Fe(II)), polyphosphate species, and orthophosphate. High concentrations of aluminum and chloride reduced method precision. Several operational changes are presented that improve precision and accuracy of the test kit for field and laboratory use. Includes 13 references, tables, figures.