When more data can mean more fun
2 December 2002
Tomorrow's spacecraft will be capable of generating more data than they can transmit to Earth. In some cases, this could be more data than can even be comfortably handled by today's computational methods. What benefits are there for us in this flood of data?
If you know how to transfer huge quantities of data, you could revolutionise some Earthly applications. In the entertainment industry, you could transmit films via satellite to waiting cinemas. Since the information is digital, audiences would see a perfect picture every time. Film distributors would no longer need endless rolls of celluloid film. The menu at cinemas would not be limited to feature films either. You could beam sporting events, musical concerts, and even news reports into cinemas, showing them live.
ESA scientists may have less fun with the challenges of transferring bulk data from space. In 2008, for example, ESA's Eddington will study 'starquakes' and search for planets, generating a 70-megabyte image every few seconds. However, the data link to Earth runs several hundred times slower, at just 130 kilobytes per second. Fabio Favata, project scientist for Eddington, has an ace up his sleeve. "We know which stars we want to observe," he says. On-board computers can send back only the information relating to the stars, not the black sky in between. This avoids unnecessary transfer.
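To make the idea concrete, here is a minimal Python sketch of sending only small pixel windows around known star positions. It is an illustration under stated assumptions, not Eddington's actual on-board software: the function names, the toy 8x8 frame and the window size are all hypothetical. The final lines assume decimal units (1 megabyte = 1000 kilobytes) to show how long one full 70-megabyte image would occupy a 130-kilobyte-per-second link.

```python
# Hypothetical sketch: keep only small pixel windows around known star positions.

def extract_windows(image, star_positions, half_size=1):
    """Return {(row, col): window} for each known star, clipped at the image edges."""
    n_rows, n_cols = len(image), len(image[0])
    windows = {}
    for row, col in star_positions:
        r0, r1 = max(0, row - half_size), min(n_rows, row + half_size + 1)
        c0, c1 = max(0, col - half_size), min(n_cols, col + half_size + 1)
        windows[(row, col)] = [image_row[c0:c1] for image_row in image[r0:r1]]
    return windows


def pixel_count(windows):
    """Total number of pixels contained in all windows."""
    return sum(len(window_row) for window in windows.values() for window_row in window)


# Toy 8x8 frame (assumed data): black sky with two bright stars.
image = [[0] * 8 for _ in range(8)]
image[2][3] = 200
image[6][6] = 150
stars = [(2, 3), (6, 6)]  # positions known in advance

windows = extract_windows(image, stars)
sent_pixels = pixel_count(windows)   # two 3x3 windows
total_pixels = len(image) * len(image[0])

# Time to send one full 70-megabyte image at 130 kilobytes per second,
# assuming decimal units.
seconds_per_full_image = 70_000_000 / 130_000

print(f"pixels sent: {sent_pixels} of {total_pixels}")
print(f"one full image would occupy the link for {seconds_per_full_image:.0f} s")
```

In this toy frame only 18 of 64 pixels are sent. A real sky field is far emptier, so the saving would be much larger, but the exact factor depends on how many stars are tracked and how big each window is.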
Sometimes astronomers need information about the whole sky, not just about the pinpointed stars. This is the problem facing ESA's scientists in the Planck mission. Planck will survey the whole sky, mapping the leftover radiation from the Big Bang. Jan Tauber, Planck's project scientist, says, "We have to retrieve our information in a smart way." Planck will compare the data with a computer prediction and send back only the differences between the two. The differences are smaller numbers than the raw measurements, so they take fewer bits and can be sent faster. On Earth, the same computer prediction is used to reconstruct the full data record.
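The difference-based scheme can be sketched in a few lines of Python. This is a simplified illustration, not Planck's actual algorithm: the sample values, the function names and the bit-counting rule are assumptions chosen to show why small differences are cheaper to send than raw measurements.

```python
# Hypothetical sketch of sending differences from a shared prediction.

def encode_residuals(measured, predicted):
    """On board: subtract the prediction from each measurement."""
    return [m - p for m, p in zip(measured, predicted)]


def decode_residuals(residuals, predicted):
    """On the ground: add the same prediction back to recover the measurements."""
    return [r + p for r, p in zip(residuals, predicted)]


def bits_needed(values):
    """Bits per value: enough for the largest magnitude, plus one sign bit."""
    largest = max(abs(v) for v in values)
    return largest.bit_length() + 1


# Assumed toy data: the prediction is close to, but not exactly, the measurement.
predicted = [1000, 1010, 1020, 1030, 1040]
measured = [1002, 1009, 1023, 1030, 1038]

residuals = encode_residuals(measured, predicted)
recovered = decode_residuals(residuals, predicted)

print("residuals:", residuals)
print("bits per raw value:", bits_needed(measured))
print("bits per residual:", bits_needed(residuals))
print("lossless:", recovered == measured)
```

The scheme only works if the spacecraft and the ground use exactly the same prediction, and it only saves bandwidth when that prediction is close to what the instrument actually measures.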
Around 2010, another ESA mission, Gaia, will have to work out how to manage very large amounts of data. Engineers designed Gaia to discover new objects as well as collect data about known ones. To cut down its data stream, its on-board software will detect every object that enters the spacecraft's field of view. It then defines a small area around the object and transmits data from that area only. Data compression software reduces the size of the data by a further factor of five.
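The detect, window and compress steps might be sketched as follows. This is a rough illustration only: the brightness threshold is a crude stand-in for real object detection, `zlib` from the Python standard library stands in for whatever compression software Gaia uses, and the frame, threshold and window size are all assumed values. No particular compression factor is implied.

```python
import zlib

# Hypothetical sketch: detect bright pixels, cut a window around each, compress.


def detect_objects(frame, threshold):
    """Return (row, col) of every pixel brighter than the threshold."""
    return [
        (r, c)
        for r, row in enumerate(frame)
        for c, value in enumerate(row)
        if value > threshold
    ]


def window_bytes(frame, centre, half_size=1):
    """Serialise the pixels in a small window around centre (values must be 0-255)."""
    row, col = centre
    r0, r1 = max(0, row - half_size), min(len(frame), row + half_size + 1)
    c0, c1 = max(0, col - half_size), min(len(frame[0]), col + half_size + 1)
    return bytes(value for frame_row in frame[r0:r1] for value in frame_row[c0:c1])


# Assumed toy 16x16 frame with two objects that were not known in advance.
frame = [[0] * 16 for _ in range(16)]
frame[4][5] = 220
frame[11][9] = 180

detections = detect_objects(frame, threshold=100)
payload = b"".join(window_bytes(frame, centre) for centre in detections)
compressed = zlib.compress(payload)

print("objects detected at:", detections)
print("window payload bytes:", len(payload), "of", 16 * 16, "in the full frame")
print("round trip ok:", zlib.decompress(compressed) == payload)
```

A real detector would group neighbouring bright pixels into one object instead of treating each pixel separately. On a payload this small, general-purpose compression adds overhead and saves nothing, so the benefit only appears on realistic data volumes.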
Gaia will generate a staggering amount of usable data: 1 petabyte (one thousand million million bytes), all of which scientists need to search and process. Even if you could search an individual data record each second, searching all the records could easily take 30 years. Michael Perryman, Gaia's project scientist, admits, "Clearly we have to set up a system that will handle this amount of data in sensible times." The Gaia team are working with commercial software producers to construct one of the most sophisticated indexed databases in history.
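A short Python sketch shows why an index matters. At one record per second, 30 years corresponds to roughly a billion records. That number is inferred from the figures above, not a stated catalogue size. A sorted index lets you find one record among that many in about 30 comparisons by binary search. The `indexed_lookup` function and the sample keys are hypothetical, and a real database index is far more elaborate than a sorted list.

```python
import bisect
import math

# Hypothetical sketch: linear scan versus lookup in a sorted index.

SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Records you could visit in 30 years at one record per second.
records_in_30_years = int(30 * SECONDS_PER_YEAR)

# Comparisons a binary search needs over that many sorted keys.
binary_search_steps = math.ceil(math.log2(records_in_30_years))


def indexed_lookup(sorted_keys, key):
    """Return the position of key in sorted_keys, or None if it is absent."""
    i = bisect.bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return None


# Assumed toy index of record identifiers, kept in sorted order.
record_ids = [3, 8, 15, 42, 77]

print("records visited in 30 years:", records_in_30_years)
print("binary search steps for that many records:", binary_search_steps)
print("position of record 42:", indexed_lookup(record_ids, 42))
print("position of record 5:", indexed_lookup(record_ids, 5))
```

The trade-off is that the index has to be built and kept sorted as data arrives, which is part of what makes a database of this scale hard to construct.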
Once such a database is developed, we could have huge benefits in Earthly applications. Why? Since the Internet itself is one huge database, Gaia's advanced techniques could translate into better, faster Internet search engines.
At the turn of the third millennium, the Human Genome Project, which set out to map the genetic code of a human being, had generated 25 gigabytes of information (a gigabyte is a thousand million bytes). At one petabyte, Gaia's database will be 40 000 times as large.