Model Identifiers

A model identifier serves three functions:

  1. It provides a unique identifier for a model in the catalog.
  2. It becomes a file name in the model store.
  3. It becomes an object name in 3D modeling tools.

These functions are essential for linking models with catalog records throughout their lifecycle and for ensuring a round-trip capability in exchanges between design tools and GIS-based model management tools.

With this in mind, we would prefer to make the identifiers compact. For example, we don't think it is necessary to use a 32-digit universally unique ID if we can get away with a 5-digit string that will be easy to find.

This page tries to answer whether an 8-character random string of upper-case letters and numerals is long enough to avoid even the minutest possibility of a name collision.

The problem is similar to the famous Birthday Problem, as described on Wikipedia.

The classic description of the Birthday Problem is: if you have 23 students in a class, what is the probability that 2 students have the same birthday? The answer is surprising: the probability is 0.507, or roughly a 50% chance of a duplicate birthday.
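The 0.507 figure can be checked directly. The following is a minimal sketch that multiplies out the chance that every birthday is distinct, assuming 365 equally likely days; the function name is only illustrative.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Exact probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


print(round(shared_birthday_probability(23), 3))  # 0.507
```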

The Wikipedia page about the Birthday Problem provides a formula for estimating the probability for other types of events.

From Wikipedia page about the Birthday Problem

In our case, we are interested in the probability of two models being assigned the same random model ID over the lifetime of the Boston city model project. Our variables would be:

  • How many possible model IDs can be randomly generated from a character string of upper-case letters and numerals? Our strings will be made of 36 characters (0-9, A-Z). That number raised to the power of the number of random characters in the ID, 15 in our format, is approximately 2 followed by 23 zeros. This is comparable to the number of days in the year in the classic definition of the Birthday Problem.
  • How many models may exist in the building model collection? This variable corresponds to the number of students.
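To get a feel for the first variable, the size of the ID space can be computed for a few candidate lengths. This is a rough sketch that assumes an alphabet of 36 characters (26 upper-case letters plus ten numerals); the names `ALPHABET_SIZE` and `id_space` are illustrative only.

```python
ALPHABET_SIZE = 36  # assumption: 26 upper-case letters plus 10 numerals


def id_space(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct random strings of the given length."""
    return alphabet_size ** length


for length in (5, 8, 15):
    print(f"{length:>2} characters: {id_space(length):.3e} possible IDs")
#  5 characters: 6.047e+07 possible IDs
#  8 characters: 2.821e+12 possible IDs
# 15 characters: 2.211e+23 possible IDs
```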

Let's say that the Boston model collection has 150,000 models and that, over its lifetime, these are completely changed out 100 times. Since retired models remain in the catalog, in this scenario we might expect 15,000,000 potential models. That is sure to be an over-estimate. This term is comparable to the Birthday Problem's number of students in the class.

We devised a spreadsheet that uses the estimating formula provided on the Wikipedia page. You can download it here: birthday_spreadsheet.odt

After experimenting with this, we determined that with 15 random characters in the model ID, the chance of any two IDs colliding in a population of 15,000,000 models is vanishingly small: about 1 in 2,000,000,000.
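The estimate can be reproduced outside the spreadsheet. The sketch below assumes the common Birthday Problem approximation p ≈ 1 - exp(-n²/(2d)), where n is the number of models and d is the number of possible IDs. The spreadsheet may use a slightly different variant of the formula, and the 36-character alphabet is an assumption.

```python
import math


def collision_probability(n: int, d: int) -> float:
    """Approximate chance that at least two of n random IDs coincide,
    using p ~= 1 - exp(-n^2 / (2d)) for an ID space of size d."""
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-(n * n) / (2 * d))


models = 150_000 * 100  # 15,000,000 models over the project lifetime

for length in (8, 10, 15):
    p = collision_probability(models, 36 ** length)
    print(f"{length:>2} random characters: p = {p:.3g} (about 1 in {1 / p:,.0f})")
```

Under these assumptions, 8 random characters make a collision almost certain, while 15 characters give odds of roughly 1 in 2 billion.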

Anticipating that some applications may mix models from different territories, like Cambridge or Harvard, our ID scheme prepends a three-character territory ID to ensure that these foreign identifiers fall in a distinct namespace.

Our model identifier format will be:

  • BOS-A1B2C-3D5E6-F7G8H
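An identifier in this format could be generated along the following lines. This is a minimal sketch, not the project's actual generator: the function name `new_model_id`, the use of Python's `secrets` module, and the 0-9 plus A-Z alphabet are all assumptions.

```python
import secrets
import string

# Assumption: the alphabet is the ten numerals plus 26 upper-case letters.
ALPHABET = string.digits + string.ascii_uppercase


def new_model_id(territory: str = "BOS", groups: int = 3, group_len: int = 5) -> str:
    """Return an ID such as BOS-A1B2C-3D5E6-F7G8H (15 random characters)."""
    parts = [
        "".join(secrets.choice(ALPHABET) for _ in range(group_len))
        for _ in range(groups)
    ]
    return "-".join([territory, *parts])


print(new_model_id())
```

Splitting the 15 random characters into three groups of five only affects readability; the hyphens and the territory prefix add no randomness.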