GNU is Not Unix

TIGER/Line 1997 data set to be released as GPL

Bruce Perens writes: "In an effort to seed the development of Open Source(TM) auto-navigation software, mapping web sites, and other geographical applications, I have purchased the TIGER/Line 1997 data set, and will be reissuing it on CD under the GPL. This is a complete U.S. map database, with GPS coordinates, street names, etc. I am offering one 6-CD set to each of Debian, GNOME, and KDE, and will consider other worthy non-profit groups with a history of finishing what they start. They will be able to use this data to develop navigation applications, etc., under the GPL or LGPL." The FAQ specifies this does not include topological data, but Bruce is looking into other sources of that.

  • Whoohoo! I'd like to see more "seed" stuff like this.

    As for topographical data,
    US Geological Survey for inside the US
    Central Intelligence Agency (yeah yeah, but they have it) for outside the US

    I think you can get that info in a digital format from either agency under the Freedom of Information Act.
  • You rock.

    Is this the same data that's in Street Atlas USA and stuff?

  • The Open Source Definition, as far as I know English and the real world, applies to software, not data... I guess what you're offering is not freely redistributable source code compilable and runnable with the standard components of some OS, or binaries produced from such code.

    I know there are books and such `released under GPL', but frankly, the only sense I can make of that is that you must distribute it with at least an offer of machine-readable `source code' for the book, that is, the files that will allow you to produce your own verbatim or modified copies of the book (like the LaTeX or SGML source).

    So I guess you must mean that whoever distributes this in, say, printed form must accompany it with a written offer to give an electronic version. Say I use some of this data, which I obtain under GPL, in an academic paper. Is this a derived work of such data? Must I accompany it with a written offer, valid for three years, to give any of the readers a machine-readable version of that data for no more than the cost of making physical copies of it?

    It all seems very murky to me...

    But anyway it's nice of you to freely redistribute that data...


  • and therefore can do what he wants with it, because he owns the license. A very selfless act, and quite impressive. I applaud you, Bruce.

    As for putting it on the net, there's nothing stopping him or anyone else. It's under the GPL license now.

  • looks like I should have done a tad more research. It was public domain to begin with, so he's legally allowed to do with it what he wants.
  • I was always wondering why they released source to RTEMS - Now I know why, they were obligated to.
  • From the aforementioned NOAA site.

    Not sure how useful this data is, but it's available for $90, and seems to have a wide variety of global data. (Although in lower resolution than some USGS data...)

    Bruce's next target? This one should be easy to recover the costs of... Or even better, get it to be a CheapBytes offering...
  • The FAQ says that the data set is public domain anyway so what value would a GPL'd CD add? Also, how "open" is this "Open Source(tm)" representative when he selectively gives away stuff like this and excludes anyone he feels isn't worthy?

    I'm not trying to say what he should do, he can do what he wants but it just seems kind of ironic to me that he would buy something up that is public anyway, then GPL it only to keep it to himself and what he feels are a few "worthy" companies.
  • First off, I'm not a lawyer, so no suing me if this is wrong.

    Anyway, the Tiger data set is public domain. If you have access to a copy of it, you can copy it freely, yadda yadda yadda. However, no one is required to make a copy available to you. As public domain, it can never be copyrighted in its current form, public domain is forever and the author has lost any and all ability to control his or her work. (This is identical to when a copyright expires.)

    However, what you can do is make a trivial change to public domain data, and copyright that. That does not change the lack of copyright on the original data, but your version is under whatever copyright restrictions you wish to place on it. If a person doesn't have access to the original, but can get yours, he or she is bound to your restrictions.

    I'm not sure if there is any sort of limit on how trivial the change must be. I think it just has to serve as an identifying marker, enough to identify the data as having been copied from your version as opposed to the original public domain data.

    U.S. Government materials are generally public domain.

    What Bruce has done is to pay the $1500 the government is charging for a copy of the data. He can now copy it and redistribute it as he likes, or modify it and add copyright restrictions.

    Remember, I'm not a lawyer.
  • The US government makes a whole lot of mapping data available free.

    Digital Raster Graphics (DRGs) are scanned USGS quadrangle maps. They're great big TIFF files--the one map one has of a section of Wyoming is 7.4M--but very nice-looking, and usable by Mayko's xmap. Some are available free; others you need to pay for, at the rate of $45 per CD-R plus $1 per file you stick on it.

    The USGS also offers Digital Elevation Models (DEMs), Digital Line Graphs (DLGs), and Land Use and Land Cover (LULC) data, at varying scales.
    DEMs are elevation data at regularly spaced points.
    DLGs are vector data for topographic lines, hydrography (flowing and standing water and wetlands), roads, trails, railways, pipelines, transmission lines, and state, county, city, and other borders. Names are included.
    LULC files "describe the vegetation, water, natural surface, and cultural features on the land surface." This includes such things as residential/commercial/industrial urban areas, types of cultivated land, 7 variations on 'barren land', and glaciers.

    This NOAA site has much the same information as the DEM files on a global scale, also including bathymetric (elevation descriptions of undersea areas) data.

    The Great Lakes Data Rescue project has bathymetric data for the Great Lakes.

    And if Bruce is listening, one'd really like a set of these CDs... see, there's this project one's working on to make an open-source browser for any data conceivably represented geographically, like weather maps, airline flight tracking, and so on...

  • Well, a lot of us out here have Cablemodems, DSL lines or T1+ lines.

    I download enough MP3s and RealAudio files, I practically download 6 cds a month anyway!
  • Can he do this?
    Tigerline is cool and we wanted to use it for a Palm/GPS/Mapping app. But, couldn't find any free maps w/ streets. There is lots of topographical data from avail for free. We used some of it (500+megs) and used a cgi to create the maps we wanted. This would be AWESOME!
  • The /. post doesn't say it, but he mentioned elsewhere that he's going to sell silver CD sets of the (GPLed copy of the) stuff for an accessible price - I assume that means about what you'd expect to pay for a 6-CD set of GPLed stuff (try one of these "multi-linux" packs).
  • Remember, the dataset is GPL'd now, so there's no obstacle to keep them from doing so.

    Well, he claims it's GPLed... but it's not unless someone makes a derivative work or change to it and re-releases it. Because the data set is in the public domain (and ALL datasets are, unless they're trade secret, check the phone book business on other sub-threads), you can't just all of a sudden take it and make it more restrictive. Since the GPL is essentially a very restrictive form of PD, you can't further restrict the TIGER license or lack thereof.

    People need to remember that PD != GPL, and the mapping from PD --> GPL can't happen transparently and seamlessly. I can't just take a copy of the phone book and copy-left it unless I have some rights over it to release. To issue a more restrictive license you must have rights over that thing. No one has rights over anything in the PD, so unless you make changes and make it into a copy-rightable item, you can't copy-left it.

    But then again I'm not a lawyer.

  • A data base/ data set cannot be copyrighted, and therefore cannot be copylefted (if there are no intellectual property rights that could be enforced you cannot create a license that is predicated on restricting you from enforcing intellectual property rights).

    It is in the public domain. You cannot GPL something in the public domain unless:

    (1) it COULD be copyrighted (this can't),


    (2) You make "substantial" changes to it.

    This is a data set.

    I think Mr. Perens refers to any derivative products based on it, such as a mapping program, which can include this for free.

    But it's not GPLed. I can release even the copy he gives me on a proprietary product w/o source code.

    But of course IANAL.
  • It seems that what we need for data like this is for someone to just make CD's and sell them for a reasonable price. Then if/when software is written that can use the data sets the programs can just be distributed and the data CD's purchased to use with the programs.

    Someone like Walnut Creek CDROM or CheapBytes or whoever could just sell the set for say $40. They should be able to recoup their investment fairly quickly.
  • > Who wants to download a 6 CD set off the 'net?

    Me! If it were available. Jeez, like a dialup connection costs me anything except time.
  • Do the math.

    You need a big fast server to dole out reasonable fractions of a 3+ Gig Database. Remember that there are a lot of people who will be riding in on a T1, T3 or Cable Modem with every intention of downloading the whole database.
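
For a rough sense of the numbers, here is a minimal sketch of the transfer times involved. The 4 GB size is the compressed figure quoted elsewhere in this thread; the link speeds are nominal line rates (56 kbit/s modem, 1.544 Mbit/s T1, 44.736 Mbit/s T3), and protocol overhead and server load are ignored, so real transfers would be slower. The function and constant names are made up for illustration.

```python
# Assumed size: the "4GB compressed" figure mentioned in the thread.
DATASET_BYTES = 4 * 10**9

# Nominal line rates in bits per second (ideal, no protocol overhead).
LINKS = {
    "56k modem": 56_000,
    "T1": 1_544_000,
    "T3": 44_736_000,
}


def transfer_hours(size_bytes, bits_per_second):
    """Hours needed to move size_bytes over a link at its full line rate."""
    return size_bytes * 8 / bits_per_second / 3600


if __name__ == "__main__":
    for name, rate in LINKS.items():
        print(f"{name}: {transfer_hours(DATASET_BYTES, rate):.1f} hours")
```

Under these assumptions a single modem user would need roughly 159 hours for the full set and a T1 user a little under 6, which is why a handful of simultaneous full downloads would keep a server busy for a long time.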

    It costs lots of money to put a BFS online.

    There is Hardware, Bandwidth and Tech time to set it up right ( even if you do it yourself, you could have been working instead ). Plus that little niggling $50 a year domain registration ( unless you want to make home on a little island :).

    According to my math it's cheaper to burn 18 CDs and Fedex them to each of the listed groups than to put up a server. Bruce can afford to live comfortably ( if that's his wish ). He is not however a wealthy man. A $100,000 donation would probably put a strain on his budget.

    My best bet on how to use this. The Programmers who will get this treasure should link with each other and hammer out a single work team. I.e. Web setups and a few GPLed applications separated more by purpose than by creator. The web version could be plastered on a server managed by these people but earning money on ads for sponsoring other OSS/FSF projects.

    The applications should remain as simple GPL and free download in source and binary so they run on most OSs etc... The Data can then be sold for CD copying prices. Walnut Creek seems to think that amounts to $25 per set ( The price of its 6 disk Linux set ). A DVD version is of course a no-brainer.

    Work to improve the accuracy of the DATA is also a smart move.

    Question. KDE is headquartered in Europe and most of the core team are not Americans. Why should they care? Or do they need it more as nonresidents?
  • Coincidentally I've been working on precisely this for several months now. My intention was to release a stripped down subset of the Tiger/Line database optimised for street mapping. (For anyone who hasn't seen it, by default it has a *ton* of data that isn't really all that useful.)

    I've got Perl code that parses the database and have been experimenting with various ways to index and organize the data to rapidly determine which data segments fall within a given visual area without doing too much real-time sorting. No display stuff yet, but I can take coordinates for a given location and very quickly retrieve the data necessary for displaying the map.
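
One common way to get that kind of lookup is a fixed grid of buckets. The following is only a sketch of the general idea in Python, not the Perl code described above: the class name, the cell size, and the segment IDs are all hypothetical, and coordinates are treated as plain longitude/latitude numbers.

```python
import math
from collections import defaultdict


class GridIndex:
    """Buckets line segments by the grid cells their bounding boxes touch."""

    def __init__(self, cell_size=0.01):
        self.cell_size = cell_size          # cell width/height in degrees (assumed)
        self.cells = defaultdict(list)      # (col, row) -> list of segment ids

    def _cell(self, lon, lat):
        # Map a coordinate to the integer column/row of its grid cell.
        return (math.floor(lon / self.cell_size),
                math.floor(lat / self.cell_size))

    def add(self, segment_id, lon1, lat1, lon2, lat2):
        # Register the segment in every cell covered by its bounding box.
        c1, r1 = self._cell(min(lon1, lon2), min(lat1, lat2))
        c2, r2 = self._cell(max(lon1, lon2), max(lat1, lat2))
        for col in range(c1, c2 + 1):
            for row in range(r1, r2 + 1):
                self.cells[(col, row)].append(segment_id)

    def query(self, min_lon, min_lat, max_lon, max_lat):
        # Return candidate segments for a view rectangle. Candidates may lie
        # slightly outside the view; exact clipping is left to the renderer.
        c1, r1 = self._cell(min_lon, min_lat)
        c2, r2 = self._cell(max_lon, max_lat)
        found = set()
        for col in range(c1, c2 + 1):
            for row in range(r1, r2 + 1):
                found.update(self.cells.get((col, row), ()))
        return found
```

Building the index is a one-time pass over the data, and a query only touches the cells under the visible area, so no sorting is needed at display time. The trade-off is that the cell size has to be chosen to suit the typical zoom level.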

    The code I'm working on is for a module for an SVGALIB based framework for automotive applications under Linux that's going in my car. E-mail, GPS, mapping, ICQ and other such features. I was planning on getting a more formal project going in another month or two when my workload at the job goes down a bit, and I'm certainly interested in talking to anyone else working on similar things.

    This resold Tiger/Line database is great -- the damn thing is free (for anyone who wasn't aware of that) but you've got to find someone who has it... I only have a single CD-ROM, not the full country.

    If anyone wants to discuss these sort of automotive applications for Linux, please feel free to drop me a line at Get rid of -nospam (obviously...)

  • I think Bruce has latched onto a great idea here. Since the end of the Cold War, the US and Russian governments have been relaxing restrictions on much of the data that they have collected, and offering it for sale. Things like remote sensing data, the CIA/KGB world fact book, etc. are useful resources that were previously unavailable. A *very* useful side-effect of GPL'ing it is that it will enforce the use of open file formats. Ever try to use Street Atlas with Linux? :-)

    For those who seem opposed to Bruce recouping his costs or making some money, stuff it. I know your kind -- you're the idiots that would rather see a community turn into a ghetto than allow someone to make some money revitalizing it, to the benefit of all.
  • Hi,

    I am researching how to get topological and other data from the U.S. Defense Mapping Agency and the U.S. Geological Survey.

    In the case of the TIGER/Line(R) database, I have taken a public-domain product and have assumed the copyright, for my particular instance of the data. Anyone who has $1500 to throw away (for the Government's costs under the Freedom of Information Act) can do this and choose their own license.

    It happens that I can also distribute this same data under a commercial license, for the non-Open-Source crowd. It might be that I can use that strategy to recover my costs and finance the purchase of more data.



  • If you really want FTP, the 1992 data is here [], but who wants to download 4 GB? Anyway, my version is more recent.


  • The database usually is reduced to a machine-readable "binary" form for use. In text-file form, it's much too large to use practically (4GB _compressed_). So the "binary" provisions in the GPL did fit.



  • Why GPL, when GPL is for source code only?

    I like the GPL because it is the preferred license of the GNU project as well as being Open Source(TM) compliant. Of course I'm an Open Source Initiative board member as well as a GNU supporter. When I can satisfy both groups, I do.

    To be useful, this data has to be made into a binary form and also you have to reduce the data by smoothing points, throwing away irrelevant fields, etc. In its source form it is impractically large: 4GB compressed text files. It's even larger uncompressed. Once that binary transformation happens, you want to be able to recover the data in source form. Thus, I am treating it as source and binary.
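
The post does not say how the points would be reduced. As one illustration only, the sketch below uses Douglas-Peucker line simplification, a standard technique chosen here as a stand-in, to thin the shape points of a street. Function names are hypothetical, and coordinates are treated as planar, which is an approximation for longitude/latitude.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (planar approximation)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop points that deviate from the overall line by at most tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > worst_dist:
            worst_index, worst_dist = i, d
    if worst_dist <= tolerance:
        # Every interior point is close enough to the chord: keep endpoints only.
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[:worst_index + 1], tolerance)
    right = simplify(points[worst_index:], tolerance)
    return left[:-1] + right
```

Because the reduction throws information away, the thinned output cannot be turned back into the original records, which fits the argument above for keeping the full text files available as the "source" form.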

    To make the GPL stick, I will make the data a component of a GPL-ed program and distribute it in that form.

    Why make something public domain less _free_? (GPL _protects_ owner rights, public domain has no owners)

    Many would tell you that the GPL protects everybody's rights. We want to add some value to the database. For example, I know of an unused dirt road that appears as a street in the database: that will be corrected. Indexing will be done, etc. Once we start adding value to this database, we don't want that added value, contributed by the community, to revert to the public domain when it could be GPL-ed.

    Why a select few?

    Those are the few who are getting free CDs. They are the groups that are in the best position to distribute the data to more people, cheaply.

    sells Linux CD's for less than $3. Why can't you go for around $12 to anyone?

    I will sell CDs, but probably for more than $12. Cheap*Bytes can sell CDs of the data for $12 if they want. My time is worth more than that.

    Free software is a community effort, not a commercial effort.

    Why not both? That is part of what we are trying to say in the Open Source Initiative.


  • DTED level-0 data (which is sufficient for any mapping program) is a free download from the NIMA (, I believe). It's a rather simple (and actually open) format, though it can be a bit hard to comprehend since all of the format stuff refers to it physically on a tape, but I have some source for reading DTED data (any level) which I wrote on a mapping project, which I'm allowed to do with as I see fit (long story). It's basically a bitmapped data file, except that the data is all in signed-magnitude (I have no idea why) instead of 2's complement, though the conversion is real easy, esp. for values >0. :) The format also has stuff for correcting for the satellite's position and all that, though all of the data I've seen (both level 0 and level 1) is pre-processed to put it on a straight grid anyway.
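
As a rough illustration of the conversion described above, here is a minimal sketch. It assumes each elevation sample is a 16-bit big-endian word whose top bit is the sign and whose low 15 bits are the magnitude; it does not parse DTED headers or record framing, and the function names are made up for the example.

```python
import struct


def signed_magnitude_to_int(raw):
    """Convert one 16-bit signed-magnitude word to an ordinary integer."""
    magnitude = raw & 0x7FFF              # low 15 bits hold the absolute value
    return -magnitude if raw & 0x8000 else magnitude


def decode_elevations(data):
    """Decode a bytes object of packed 16-bit big-endian samples (assumed layout)."""
    count = len(data) // 2
    words = struct.unpack(f">{count}H", data[:count * 2])
    return [signed_magnitude_to_int(w) for w in words]
```

For non-negative values the stored word and the two's complement value are identical, so only words with the sign bit set need any work.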

    Also, DTED level 1 is free (beer) under the oft-mentioned FOIA, though it's not free (speech) as its distribution is controlled, IIRC.

    As for terrain feature data, there's ITD and VPF, but I'm pretty sure that's controlled, and neither format is very pretty. ITD is basically set up like a funky quasi-vector raster format (don't ask), and VPF is set up like a relational database. Yuck.

    Oh, the NIMA *has* put out some code, called NIMA-MUSE, which will read any NIMA format, though I've found it to be horrifically buggy. It can't even read DTED level-1. Actually, I couldn't figure out *how* to get it to read *any* DTED level. It read VPF adequately, though it was very slow (like, it took several minutes to read a small file on a P166), and I never got to try it on ITD. I'm pretty sure that the source for MUSE is free (speech) though.

  • -mp3's
    -check your email/internet
    -voice over modem/cellphone
    -gps/streets data
    -engine info/running stats/runtime tuning (saw someone working on universal car computer interface for Gnome.)
    -And best of all... pong for rush hour!

    I can't wait to pick up the first Auto distro.

    Two questions.

    *Is there a driver for a GPS unit/card for Linux? (Does such a thing exist?)

    *Will someone come out with a machine with at least 2 gigs of hd space that fits in my radio slot and will run a slightly stripped-down Linux fast enough for the above-mentioned things, for about 500 bucks?

    Then again maybe I'm just a sucker for any car with a script configurable, turbo boost button. ;-)

  • The data appears to be free as in speech, but in order to get a copy of it, one has to pay the government $1500. However the data is in the public domain so once you have acquired it, you can re-release it for free (in the beer sense) or you can release it under whatever terms you want (assuming you can get someone to buy it).

    So this leads to the question: if there is government data that people care about available, can we start a fund to buy it and then release it under the GPL? For instance I would pay $20 for a copy of the data, $10 of which might cover the copying of the CDs, labor and shipping, leaving $10 to go towards the $1500. So if I can get 150 of my closest friends to join in and collaborate then we can all buy a copy from the government and have our own copy of the 6 CDs.

    Perhaps someone* should start a website with interesting government data links and allow people to say how much they would pay for the data, and once a certain threshold is hit (say 150% of the price to cover people who change their minds) it would mail everyone for confirmation, they could submit payment and the product could be ordered, duplicated and mailed out. If there was a surplus then a fund could be started to put the extra in so that it could cover unforeseen costs or maintenance of the hardware or whatever.
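
The arithmetic behind the proposal can be checked with a short sketch. The figures are the ones from the post ($1500 for the data, $20 per set, $10 per set for copying and shipping, a 150% pledge threshold); the function names are hypothetical.

```python
import math


def backers_needed(data_cost, price_per_set, copy_cost_per_set):
    """Number of buyers required before the data purchase is covered."""
    contribution = price_per_set - copy_cost_per_set
    if contribution <= 0:
        raise ValueError("price per set must exceed the copying cost")
    return math.ceil(data_cost / contribution)


def pledge_threshold(data_cost, overcommit=1.5):
    """Total pledges to collect before ordering, padded for drop-outs."""
    return data_cost * overcommit


if __name__ == "__main__":
    print(backers_needed(1500, 20, 10))   # buyers needed at $10 each toward the data
    print(pledge_threshold(1500))         # pledges required at the 150% threshold
```

With the post's numbers this gives 150 buyers and a pledge target of $2250, so the 150% rule amounts to waiting for 225 pledges at $10 each before placing the order.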

    *note that someone is not currently me since I don't have the time or attention to detail that this would require.

  • Is the data in the preferred form for modification? I rather suspect that the preferred form is a database of some sort.
  • The UNAVCO (University Navstar Consortium) has prepared a programme package TEQC (Translation, Editing, Quality Control) for reading and formatting to RINEX (the open standard for geodetic GPS data) from a considerable number of both geodetic and simpler GPS receivers. It can read both stored data files and input from the serial port.

    This software is (currently) binary only, but free and Linux is one of the primary platforms supported.

    Here [].
  • The value of releasing under the GPL seems to be that any application using the data set must also be GPL'd. There is no such restriction if the data set is public domain.

    As the article said, the purpose here is to encourage the development of high quality GPL'd mapping and navigation programs.
  • So if he does not add anything, but merely passes on the data that he received which (assume for the purpose of this discussion) is public domain, could one simply use it under the original public domain status? Suppose he does add something: if I were to remove what he added, could I use the data under the original public domain terms?
  • I don't know what format the data is in, but assume for the moment that the format is well known, and that the specification of the format is not GPLd. Then anyone can write a program that munges data in that format. They can release this program under any license they like since they have not created a work that is a derivative of a GPLd work. In fact the authors of the program may never even have a copy of the database.

    On the other hand if the data in the database is what is GPLd (assume for the moment it can be) then derivative works would amount to translations into other formats, and corrections. As much as I am not a fan of the GPL I think in this case it is a good thing because it means that if someone makes corrections to the database and releases it, we all can get it.

    So we all would get a free database, and people would make money by adding value by writing a program (which as I outline above would be under any license they choose) to do something with the data.
  • If you examine the ACLU site you will see at:
    If your request is not for "commercial use," you will only pay the search and duplication costs

    If your request is on behalf of "an educational or non-commercial scientific institution" or as a "representative of the news media," you will only pay duplication costs. Any person or organization which regularly publishes or gives out information to the public can be considered as "news media." Many public service organizations, therefore, meet this definition.

    and that

    Sometimes an agency will waive the fee. This will happen on a case-by-case basis if the request is considered to be in the public interest, which means the information will significantly help the public understand the operations or activities of the government agency.

    Perhaps he should have applied for a waiver of the fee, if this was his purpose in getting it.

    Also $1500 sounds a little excessive for duplication charges for what amounts to 6 CDROMs. Hell, at today's prices you could buy them one of those cheap sub $1000 PCs we've all been hearing about, add a CDROM burner, and a network card to hook up to their network, pay for electricity and $25 an hour for a person to run it all and still be under $1500 for burning 6 CDROMs.
  • Bruce, I can't thank you enough for sticking with the whole open source thing. And putting your money where your mouth is- amazing.

    But wouldn't the opencontent license be more appropriate for the data? I mean, there's no binary and whatnot that the GPL refers to.

    Thanks again.


  • From the FAQ (Question 12):

    ... snip ...

    No, the TIGER data and the maps made from the TIGER Map Service are not copyrighted. The data used comes from the Census Bureau, an agency of the U.S. Government, and is in the public domain

    ... snip ...
  • I agree, why not just put the thing up on a web site for anyone?

    "Is Tiger/LINE public domain?"

    I just went to the Tiger web site and read their FAQ, all data from the Tiger database is public domain. However, is it possible for anyone to put their copy of the database under a copyright? I thought that even if something was in public domain, only the owner of that thing could copyright it, but I'm not a lawyer so I may be way off on this. Anyone know?
  • The groups mentioned already have their hands full. I'd rather see a new group formed to make a GPL or "free" GIS system for linux. Most GIS systems that are commercially available are overpriced (due to the fact that they are marketed for Municipalities only, government has gobs of money, let's steal it from them attitudes) and almost nothing is available for Unix/linux. If we can get a useable GIS system designed that can use this dataset or something close for poor local governments/people that would bring linux farther into the fight.

    Let's get a fresh group together for that, and let the busy groups finish their already hectic jobs.
  • Q12: Is the TIGER data or are the maps that I make with the TIGER Map Service copyrighted?

    No, the TIGER data and the maps made from the TIGER Map Service are not
    copyrighted. The data used comes from the Census Bureau, an agency of the
    U.S. Government, and is in the public domain. In fact, many of our products
    are resold. Vendors take the basic product, add value to it (snazzier
    interface, more data, etc.) and sell it. The many PC street mapping packages
    available attest to that. We do however have trademarks on a number of our
    TIGER-related product names.
  • Someone wrote: "I'd rather see a new group formed to make a GPL or "free" GIS system for linux. Most GIS systems that are commercially available are overpriced (due to the fact that they are marketed for Municipalities only, government has gobs of money, let's steal it from them attitudes) and almost nothing is available for Unix/linux..."

    Well, there is a very, very, very powerful GIS application developed for our beloved OS - it's the GRASS GIS system. Check it out at and see for yourself. It has way too many features to name here, but it was developed by the Army Corps of Engineers and put into public domain. The potential for this system is huge, and more developers could be used to make GUI frontends, etc.

  • Being one who has been doing this for a good while, I just wanted to throw out some links for people who are inspired. I am into code, not talk, so don't be asses.

    shapelib by Frank Warmerdam. A lib to read SHP and DBF's associated. Beware, the SHX file is required as well. I have found that there is a mem leak that doesn't release on file close. Hrmph.. so I wrote my own file format.

    gd by Tom Boutell.
    Currently at Ver 1.3. However, be sure you have no overlapping points with the Solid polygon code. Causes an odd on the sort, so the scanline isn't drawn. I've got a better version as well. The only reason for the Ver1.3 was to change the LZW compression to run-length encoding. *blah*

    another good link is:

    which has a fully functional mapserver, very similar to mine. Basic problems are speed and bandwidth. It doesn't really "clip", but instead relies on ordered binary trees and such. This needs some work, but is workable.

    Anyhow, upwards and onwards.. let me know if I can be of any help.

  • Does anyone know if there's any freely available
    cartographic data for the world (as opposed to
    the US)? I'm trying to find a free vector-based
    map of countries and states of the world, but
    haven't had any luck.

    Any ideas?
  • Interestingly, the TIGER data forms the basis for a *lot* of data being sold by big companies. Most street level data of the US is derived from TIGER and then cleaned up/enhanced by those companies.

    There was a thread on comp.infosystems.gis a while ago about starting a free data movement. And don't forget, somewhere in the past there was (don't know if it still exists) an "Open Content" movement.

    There's a nifty worldwide data set called VMAP (formerly Digital Chart of the World) on 4 CDs available from the USGS.

    But back at the ranch, there are some Open Source (tm) mapping systems, none of which handle TIGER as far as I know. In fact, OpenMap(tm) is one that we released just before Christmas. I'd invite the OS community to dive in and write an OpenMap LayerBean that handles TIGER data. OpenMap does handle VMAP but there's always room for improvement...
  • VMAP from the USGS is about $100. (I.e. distribution cost. You can give it away once you buy it).
  • Then there's OpenMap(tm) from BBN. We released it last month.
  • Check out OpenMap(tm) from BBN. It's all Java and Open Source(tm).
  • If he's purchased the data set, then I would imagine that gives Bruce the rights to put it under pretty much any license he wants - assuming there were no special clauses in the purchase agreement. If you buy the rights to some software (not just a end-user license) you should be able to redistribute the software under any license you desire.
  • I've written, for simulation purposes, a GPS Simulator which requires YUMA format almanac files to work. I've also written a trajectory simulator (WGS based) which derives a position, speed and attitude from initial conditions (Position, Speed, Acceleration and Attitude).

    I thought this might interest you (in order to test your program worldwide). The program is C++ and uses a proprietary matrix library (I cannot release it under a GPL-like license in its current form but I've planned to do so if it's of interest).
  • are that the product does look somewhat copyrighted. If I read the site correctly.

    So how is it that someone can go about taking what appears to be a proprietary, copyrighted product, re-release it verbatim, and call it GPL'd? (My big question.) I know I'd be a little pissed if I found one of my pieces of software that I have not GPL'd that someone went and "released" to the public floating around...

    Unless of course the fee paid for the product included reproduction rights, but I doubt it.

    Or, (I say, admitting my ignorance of many US laws) is it just the fact that the data comes from the government that makes this information public domain? (freedom of info. act?)

    Enlighten me.

  • ...Does not hold a Copyright of any kind...

    But doesn't this still prevent a third party (ie: Mr. Perens) from re-releasing the data under a more restrictive licence?

    (The GPL being more restrictive than no licence at all.)

  • I think it would be great if they distributed a database with Canadian geographical information.
