The Population Analyst includes integrated datasets that users can select from a menu. As originally created, it included a single collection of these datasets, New York Counties 2000. Dataset collections reside in subfolders of the Population Analyst’s data directory, as shown below. (The generic folder contains default projection parameters for uploaded or manually entered datasets.)

Each subdirectory of data other than generic is interpreted as a separate dataset collection. These subdirectory names must not contain spaces, so underscores are used instead (underscores are replaced with spaces when the name is displayed to the user).
Each dataset collection contains four subdirectories, named fertility, migration, population, and survival.

Each subfolder contains a plain text file for each dataset in the collection. As with the collection folders, underscores should be used instead of spaces in these dataset names. The exact same filenames should be used in each of the subdirectories.
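The layout rules above can be checked before a collection is uploaded. The following is a minimal sketch in Python, not part of the Population Analyst itself; the function names check_collection and display_name are hypothetical, and the sketch assumes every regular file in a subdirectory is a dataset file.

```python
from pathlib import Path

# The four subdirectories every collection is expected to contain.
REQUIRED_SUBDIRS = ("fertility", "migration", "population", "survival")


def display_name(name):
    """Return the name as shown to the user (underscores become spaces)."""
    return name.replace("_", " ")


def check_collection(collection_dir):
    """Return a list of layout problems found in one collection folder."""
    collection_dir = Path(collection_dir)
    problems = []
    if " " in collection_dir.name:
        problems.append(f"collection name contains spaces: {collection_dir.name!r}")

    names_by_subdir = {}
    for sub in REQUIRED_SUBDIRS:
        path = collection_dir / sub
        if not path.is_dir():
            problems.append(f"missing subdirectory: {sub}")
            continue
        # Assumption: every regular file here is a dataset file.
        names = {p.name for p in path.iterdir() if p.is_file()}
        for name in sorted(names):
            if " " in name:
                problems.append(f"{sub}/{name}: filename contains spaces")
        names_by_subdir[sub] = names

    # The same filenames should appear in every subdirectory.
    all_names = set().union(*names_by_subdir.values())
    for sub, names in names_by_subdir.items():
        for missing in sorted(all_names - names):
            problems.append(f"{sub}: missing file {missing}")
    return problems
```

An empty list means the folder matches the structure described here; it says nothing about whether the contents of the files are valid.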
The image below shows how there is a separate population file for each county dataset in the collection. Each of these population files uses the standard Population Analyst text file format. When preparing a new Population Analyst dataset collection from data in spreadsheet form, the Excel to Population Analyst conversion guide may be helpful.

The files in the fertility subdirectory employ a similar format. The first three lines contain the place name, year, and source of the fertility data. These lines are followed by eight lines corresponding to the fertility rates (children per woman per year) for the females in the following age groups:
An example:

These rates are multiplied by the populations of the corresponding female cohorts and by the number of years in the projection interval to estimate the number of children born during that interval. The Population Analyst assumes an even male/female split of births. (This ratio is defined in the code file projection.cgi as $mfratio; search for this term and adjust the 0.5 value assigned to it to alter the Population Analyst’s birth ratio.)
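The birth estimate described above can be sketched as follows. This is an illustrative Python version, not the code in projection.cgi. The function name estimate_births is hypothetical, the five-year default interval is an assumption, and male_share stands in for $mfratio on the assumption that the value gives the male fraction of births.

```python
def estimate_births(fertility_rates, female_populations,
                    interval_years=5, male_share=0.5):
    """Estimate (male_births, female_births) for one projection interval.

    fertility_rates: children per woman per year, one per age group.
    female_populations: female cohort sizes for the same age groups.
    male_share: stand-in for $mfratio (assumed to be the male fraction).
    """
    if len(fertility_rates) != len(female_populations):
        raise ValueError("need one fertility rate per female cohort")
    # Annual births per age group, summed, then scaled to the interval.
    annual = sum(r * p for r, p in zip(fertility_rates, female_populations))
    total = annual * interval_years
    male = total * male_share
    return male, total - male
```

With the default male_share of 0.5 the two returned values are equal, which matches the even split the text describes.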
The survival rate files follow the population file format: three lines of identifying text followed by 36 lines of numbers, each corresponding to a particular age/sex cohort. These survival rates specify the likelihood of an individual in that cohort surviving for five more years. These values are multiplied by the initial cohort populations to determine how many survive to the next interval; a value of 1 indicates no deaths whatsoever.
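As a rough illustration of this format, the Python sketch below reads a file with three identifying lines followed by 36 numbers and applies survival rates to a set of cohort populations. The function names are hypothetical. It assumes one plain number per line with no blank lines in the header, and it covers only the multiplication step, not how survivors move into the next age group.

```python
def parse_cohort_file(text, expected_values=36):
    """Parse three header lines followed by one number per line."""
    lines = text.strip().splitlines()
    if len(lines) < 3:
        raise ValueError("expected three identifying lines")
    place, year, source = (line.strip() for line in lines[:3])
    # Assumption: remaining non-blank lines each hold one plain number.
    values = [float(line) for line in lines[3:] if line.strip()]
    if len(values) != expected_values:
        raise ValueError(f"expected {expected_values} values, found {len(values)}")
    return {"place": place, "year": year, "source": source, "values": values}


def apply_survival(populations, survival_rates):
    """Multiply each cohort by its five-year survival rate."""
    if len(populations) != len(survival_rates):
        raise ValueError("need one survival rate per cohort")
    return [p * s for p, s in zip(populations, survival_rates)]
```

Passing expected_values=8 lets the same parser read a fertility file, which has eight rate lines instead of 36.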

The migration rate files adhere to the population file format as well. Each number expresses the change likely due to migration during a five-year interval; positive values represent in-migration and negative values represent out-migration. These values are multiplied by the surviving population to yield the number of net migrants, which is added back to that population to yield the final post-migration population.
The migration rates for the youngest male and female cohorts should be set to zero as the people in that category were not present at the beginning of the migration measurement or projection interval.
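The migration step described above amounts to one multiplication and one addition per cohort. A minimal Python sketch, with a hypothetical function name and made-up numbers in the usage note:

```python
def apply_migration(surviving_populations, migration_rates):
    """Return post-migration populations for each cohort.

    Positive rates add in-migrants; negative rates remove out-migrants.
    """
    if len(surviving_populations) != len(migration_rates):
        raise ValueError("need one migration rate per cohort")
    result = []
    for population, rate in zip(surviving_populations, migration_rates):
        net_migrants = population * rate        # may be negative
        result.append(population + net_migrants)
    return result
```

For example, a surviving cohort of 2000 with a rate of 0.05 gains about 100 net migrants, while a cohort whose rate is zero, as recommended for the youngest cohorts, is returned unchanged.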

Population counts for nearly any US geography are available from the Census Bureau’s American FactFinder web site.
Age-specific fertility and survival rates for New York were available from the state Vital Statistics page (Tables 8 and 36, respectively). These figures were converted from per-1000 rates to decimal rates using the actual populations of the corresponding cohorts. Similar sources will have to be found for new dataset collections.
Migration rates were derived from the county migration table b2_table3_050.txt from the Census 2000 Migration Data DVD. The New York counties were extracted by FIPS code, their net migration counts were split in half (assuming males and females migrate at the same rate), and these counts were divided by the corresponding cohort populations to yield rates. A similar process should be possible for county collections in other states, but different sources will need to be found for other geographies.
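The derivation above can be sketched in Python as follows. The function name and argument layout are hypothetical, and the sketch does not read b2_table3_050.txt; it assumes the net migration counts and cohort populations have already been extracted per age group. Returning a rate of zero for an empty cohort is also an assumption.

```python
def migration_rates_from_net_counts(net_counts, male_populations,
                                    female_populations):
    """Derive (male_rates, female_rates) from combined net migration counts.

    net_counts: net migrants per age group, both sexes combined.
    Each count is split evenly between the sexes, then divided by the
    corresponding cohort population.
    """
    if not (len(net_counts) == len(male_populations) == len(female_populations)):
        raise ValueError("all inputs need one entry per age group")
    male_rates, female_rates = [], []
    for net, males, females in zip(net_counts, male_populations,
                                   female_populations):
        half = net / 2  # assumes males and females migrate equally
        # Assumption: an empty cohort gets a rate of zero.
        male_rates.append(half / males if males else 0.0)
        female_rates.append(half / females if females else 0.0)
    return male_rates, female_rates
```

Because the split is by count rather than by rate, the male and female rates for an age group differ whenever the two cohort populations differ.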
To create a new dataset collection for use with the Population Analyst, emulate the folder structure described above, create the data files in the formats described above, and upload the finished collection to the Population Analyst’s data directory as shown below.
