This article is off topic and likely not of general interest.
I was commenting over at Birther Report, explaining why Zullo’s race codes were fake: they are neither federal codes nor Hawaiian codes. I showed from an example certificate that code “3” stood for the race “part-Hawaiian” on a 1961 Hawaiian birth certificate. One of the rather dim bulbs who comment there argued, based on the same certificate:
[A] 3 is also placed by the "steamship" company, what kind of race is steam ship company. Looks more like people making things up as they go on this piece of forged paper.
That comment struck me as pretty dumb, but then I had to realize that I am something of a coding wonk. Maybe some people think that the same number represents the same thing no matter what the context. Codes are meaningful only in the context of a code set, like a set of race codes, or occupation codes, or diagnosis codes.
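Here is a minimal sketch of that idea in Python. The race entry is the one from the certificate discussed above; the second code set, its name and its meaning for “3” are invented purely for illustration and are not what the certificate actually uses.

```python
# Each code set is its own lookup table; a bare code means nothing by itself.
CODE_SETS = {
    # Race code mentioned above (1961 Hawaiian code set).
    "hawaii-1961-race": {"3": "Part-Hawaiian"},
    # Hypothetical second code set, made up for this example only.
    "example-industry": {"3": "Transportation"},
}


def decode(code_set: str, code: str) -> str:
    """Return the meaning of `code` within the named code set."""
    return CODE_SETS[code_set][code]


# The same "3" decodes differently depending on the code set it belongs to.
print(decode("hawaii-1961-race", "3"))
print(decode("example-industry", "3"))
```

Asking what “3” means without naming the code set is like asking for a dictionary entry without naming the dictionary.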
I participated in a number of the Public Health Information Network (PHIN) conferences where coding standards (vocabularies and code sets) were an important topic. I was also a member of Health Level 7 (HL7) which deals with protocols for electronically exchanging data in a way such that systems can understand each other. I was one of the contributors to the 2006 Implementation Guide for Immunization Data Transactions using Version 2.3.1 of the Health Level Seven (HL7) Standard Protocol and I presented at several conferences including:
- Adapting Vendor Clinical Systems for Real-Time Registry Participation using HL7
- An HL7-Centric Immunization Registry
- Transport of Immunization HL7 transactions over the Internet
I said before that codes are meaningful only in the context of a code set, so how do you identify the code set that a code comes from? It’s simple: you assign codes to code sets (see example at the end of this article). 😉
In HL7, the race code set comes from the US Office of Management and Budget, and HL7 designates the value set as User Defined Table 0005. Note the two different terms, "code set" and "value set." A code set is the comprehensive list of codes defined. A value set is a list of codes from the code set that may be used for a particular application. See the race table in the Immunization HL7 Implementation Guide, Page A1-1.
In the case of Barack Obama’s birth certificate, his father’s race of African was coded “9” which in the 1961 Hawaiian code set stood for “Other Race.” Presumably because of the low number of black residents in Hawaii at the time, there was no separate code for “black.” The whole concept of “other” presents an interesting problem because one doesn’t necessarily know whether “other” means not known but potentially available, not in the code set, not known and not obtainable, etc. See the article “Flavors of Null” by Barry Smith.
Computer programmers are probably familiar with something called a universally unique identifier (UUID), one example of which is the more familiar globally unique identifier (GUID). A GUID is a string of digits (expressed in hexadecimal notation) that’s supposed to be unique. Here’s one I just made for you, and presumably you can search the world over and never find another like it:
GUIDs are not necessarily unique; Wikipedia says:
Assuming uniform probability for simplicity, the probability of one duplicate would be about 50% if every person on earth as of 2014 owned 600 million GUIDs.
One way to uniquely identify something is through a registration authority, for example the Internet Corporation for Assigned Names and Numbers (ICANN). The obamaconspiracy.org domain is globally unique and I can then use it as a context for anything else I want to define. For example, each comment on this blog has a comment ID, a number; however, that number is not unique; it exists on many blogs. I can uniquely identify a comment here by the combination of the obamaconspiracy.org domain and the comment ID. Alternately I could tell everybody about that GUID above and say it means obamaconspiracy.org and attach the comment ID to that.
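As a rough sketch of that combination, the pair below treats the registered domain as the context and the comment number as the local part. The class name, the `#` separator, the comment number and the second domain are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedId:
    namespace: str  # globally unique context, e.g. a registered domain
    local_id: int   # unique only inside that namespace

    def __str__(self) -> str:
        return f"{self.namespace}#{self.local_id}"


# Same comment number on two blogs: equal local IDs, different identities.
here = QualifiedId("obamaconspiracy.org", 12345)
elsewhere = QualifiedId("example.org", 12345)  # hypothetical other blog

print(here)
print(elsewhere)
print(here == elsewhere)
```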
There is another registration methodology that is used in healthcare (and elsewhere) called an “object identifier” or OID. OIDs are represented with numbers joined by dots, the same way an IP address appears. The first number is the high-level assigning authority:
- 0 – The International Telecommunication Union, ITU-T
- 1 – International Organization for Standardization (ISO)
- 2 – Joint ITU-T, ISO
OIDs are used in many contexts from medical records to security certificates, and they can be used to identify code sets, such as the PHIN code set for race, with OID 2.16.840.1.113883.1.11.14914. They are also used in SNMP management.
When I was in private business, our company had an OID through the Department of Defense (common for US companies), subordinate to the ISO. This was our OID hierarchy:
- 1.3.6.1.4.1 – IANA-registered Private Enterprises
- 1.3.6.1.4 – Internet Private
- 1.3.6.1 – Internet (OID assignments from 1.3.6.1)
- 1.3.6 – US Department of Defense
- 1.3 – ISO Identified Organization
- 1 – ISO assigned OIDs
- Top of OID tree
The final company OID was 1.3.6.1.4.1.27263. With that, the company could add onto the end of the hierarchy indefinitely, numbering anything. In the best of all possible worlds, there is a web page somewhere that explains all the codes defined. Unfortunately, the company I worked for got bought out, I retired, and the OID web page with all of its sub-definitions now exists only on the Wayback Machine. It’s not cool for these things to go away. If you would like to look up an OID, try oid-info.com.
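Because every OID carries its whole ancestry in its own text, the hierarchy can be recovered just by peeling arcs off the end. A small sketch, using the well-known IANA Private Enterprises arc 1.3.6.1.4.1 as input (the function name is my own):

```python
def ancestors(oid: str) -> list[str]:
    """List every prefix of a dotted OID, from the root arc down to the OID itself."""
    arcs = oid.split(".")
    return [".".join(arcs[:i]) for i in range(1, len(arcs) + 1)]


# Walk from the ISO root down to the Private Enterprises node.
for node in ancestors("1.3.6.1.4.1"):
    print(node)
```

Each printed line is a node that some registration authority is responsible for, which is why a lookup site can walk you from the top of the tree down to a single company.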
So can you get your own OID? Yes. There are several ways to get an OID, some free and some for a fee—some quick and some that take a while. If you have your own country, you can get an OID for your country, and then start assigning numbers from your own node. You can also attach a UUID to the OID hierarchy and use that.
So the exercise for the rest of the article is to get an OID and to number something. Obviously one wants an important-looking OID with lots of panache, a bold Internet presence and some permanence, but that’s hard to do if you’re not a company or a country. If you are a private enterprise, then I’d suggest the DOD route. For other options, see the OID Repository FAQ. For our purposes, we could attach a UUID onto the OID tree. UUID-based OIDs all begin 2.25. Rather than use the GUID above, I’ll generate a UUID based on the ITU algorithm implementation on their web site.
OIDs, except for the final entry, must be decimal numbers, so we’ll have to convert that hex string to decimal. I found a web page at mobilefish.com that will do that, and the answer is 165606865786509621373610282052264248603.
I could then use 2.25.165606865786509621373610282052264248603 (called an “arc”) as a start of any coding systems I can imagine, plus assigning them to other people. However, that OID is pretty plug ugly, and extending it could create a result longer than the 64 character limit in some software.
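The hex-to-decimal step does not need a web page. Here is a sketch using Python’s standard `uuid` module; the helper names are my own, and the 64-character figure is simply the limit mentioned above, not something the module enforces.

```python
import uuid


def uuid_to_oid(u: uuid.UUID) -> str:
    """Attach a UUID under the 2.25 arc by writing its 128-bit value in decimal."""
    return f"2.25.{u.int}"


def oid_to_uuid(oid: str) -> uuid.UUID:
    """Recover the UUID (in the usual hex form) from a 2.25.<decimal> OID."""
    prefix, _, tail = oid.rpartition(".")
    if prefix != "2.25":
        raise ValueError("not a UUID-based OID")
    return uuid.UUID(int=int(tail))


article_oid = "2.25.165606865786509621373610282052264248603"
u = oid_to_uuid(article_oid)

print(u)                               # the UUID in hex notation
print(uuid_to_oid(u) == article_oid)   # the conversion round-trips
print(len(article_oid) <= 64)          # still inside a 64-character limit
```

The bare arc fits within 64 characters, but the margin shrinks with every sub-arc appended, which is the practical objection to this route.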
An alternative that is easy for individuals is a free service from a company called ThinkSoft. It’s quick to have a number assigned and registered. Here is the OID I received: 1.3.6.1.4.1.37476.9000.7 (note that this is also under the DOD arc my former company used).
So let’s start numbering:
- 1.3.6.1.4.1.37476.9000.7 – Kevin Davidson’s OID arc
- 1.3.6.1.4.1.37476.9000.7.0 – Kevin Davidson’s domains
- 1.3.6.1.4.1.37476.9000.7.0.0 – blogordie.com
- 1.3.6.1.4.1.37476.9000.7.0.1 – obamaconspiracy.org
- 1.3.6.1.4.1.37476.9000.7.0.1.0 – obamaconspiracy.org articles
- 1.3.6.1.4.1.37476.9000.7.0.1.1 – obamaconspiracy.org comments
- 1.3.6.1.4.1.37476.9000.7.0.1.1.10821 – first comment at obamaconspiracy.org using the phrase “any day now”
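To show how software might read such a numbering, here is a small sketch that finds the longest registered arc covering an OID and reports whatever is left over. It assumes the arc reads 1.3.6.1.4.1.37476.9000.7, as under the Private Enterprises node; the `REGISTRY` dictionary and `describe` function are illustrative names, not part of any OID standard.

```python
# Registered arcs from the list above, keyed by dotted OID.
REGISTRY = {
    "1.3.6.1.4.1.37476.9000.7": "Kevin Davidson's OID arc",
    "1.3.6.1.4.1.37476.9000.7.0": "Kevin Davidson's domains",
    "1.3.6.1.4.1.37476.9000.7.0.0": "blogordie.com",
    "1.3.6.1.4.1.37476.9000.7.0.1": "obamaconspiracy.org",
    "1.3.6.1.4.1.37476.9000.7.0.1.0": "obamaconspiracy.org articles",
    "1.3.6.1.4.1.37476.9000.7.0.1.1": "obamaconspiracy.org comments",
}


def describe(oid: str) -> str:
    """Describe an OID by its longest registered prefix plus any leftover arcs."""
    arcs = oid.split(".")
    for end in range(len(arcs), 0, -1):
        prefix = ".".join(arcs[:end])
        if prefix in REGISTRY:
            rest = ".".join(arcs[end:])
            return REGISTRY[prefix] + (f", item {rest}" if rest else "")
    raise KeyError(f"no registered arc covers {oid}")


print(describe("1.3.6.1.4.1.37476.9000.7.0.1.1.10821"))
# prints: obamaconspiracy.org comments, item 10821
```

The comment number only becomes meaningful once the arc in front of it says which code set it belongs to.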