National Center for Supercomputing Applications (NCSA)
One of the large projects the National Center for Supercomputing
Applications is involved in is TeraGrid, a distributed network of
high-performance computers that is used for parallel computations to
solve large-scale scientific problems. Michael Shapiro used DbVisualizer
to reverse engineer and visualize a TeraGrid database that he assumed
responsibility for in his role as Senior Systems Engineer at NCSA.
How do you figure out how a complex database is structured when you take
over responsibility for extending a system that you did not design
yourself? That was an important question for Michael Shapiro
when he joined the TeraGrid project at the National Center for Supercomputing
Applications (NCSA) some eight years ago. TeraGrid is a distributed
supercomputing resource that combines the computational power of high-performance
hardware at 11 sites across the US, and Michael’s database keeps
track of projects, individual usage and proposals for new projects to run
on TeraGrid.
"I was given responsibility for the TeraGrid central database, and one
challenge was to understand it so I could continue to develop it.
One tool that I thought would help was one that could produce an ER diagram
of the database. I looked around and found DbVisualizer. DbVisualizer
enables me to reverse engineer to produce ER diagrams. It also has some
other features like the Navigator that I can set up by just clicking
that looked very useful," said Michael Shapiro, Senior Systems Engineer
at NCSA.
The database that tracks all the science projects that run on TeraGrid
is developed on Postgres. Usage information for all TeraGrid resources is
reported by all the participating sites and entered into this central
database. The database is hosted in San Diego, with development and
maintenance at NCSA in Illinois.
"I searched the web for a tool that could reverse engineer a Postgres
database. A lot of people were recommending DbVisualizer, and it was a
lot easier to use than other products I found and tried."
Michael Shapiro, Senior Systems Engineer, National Center for
Supercomputing Applications
"The ability to reverse engineer and visualize an existing database may
appear to be a pretty narrow use for a tool, but it was important to me
for initially exploring the database, as well as communicating the
structure to other members of the teams involved in the project. I am
convinced that there are similar scenarios in most large organizations
that rely on complex databases. In fact, I think this is an area where
even more functionality could be warranted," said Michael Shapiro.
Now that he is several years into the job, Michael Shapiro and his team
have left a significant mark on this database. DbVisualizer was
important for understanding the initial design, which was completed before
he took over responsibility for further development of the database.
About NCSA
The National Center for Supercomputing Applications (NCSA), located at
the University of Illinois at Urbana-Champaign,
provides powerful computers and expert support that help
thousands of scientists and engineers across the country improve our world.
With the computing power available at NCSA, researchers simulate how
galaxies collide and merge, how proteins fold and how molecules move
through the wall of a cell, how tornadoes and hurricanes form, and
other complex natural and engineered phenomena.
NCSA was established in 1986 as one of the original sites of the
National Science Foundation's
Supercomputer Centers Program. For more than 20 years, NCSA has been
a leader in deploying robust high-performance computing resources and
in working with research communities to develop new computing and
software technologies.
About TeraGrid
TeraGrid is a distributed, massively parallel supercomputing structure,
combining resources at eleven partner sites to create an integrated,
persistent computational resource.
Using high-performance network connections, the TeraGrid integrates
high-performance computers, data resources and tools, and high-end
experimental facilities across the US. Currently, TeraGrid resources
include more than a petaflop of computing capability and more than 30
petabytes of online and archival data storage, with rapid access and
retrieval over high-performance networks. Researchers can also access
more than 100 discipline-specific databases. The TeraGrid is the world's
largest, most comprehensive distributed cyberinfrastructure for open
scientific research.
TeraGrid is coordinated through the Grid Infrastructure Group (GIG) at
the University of Chicago in partnership with Indiana University, the
Louisiana Optical Network Initiative, National Center for Supercomputing
Applications, the National Institute for Computational Sciences,
Oak Ridge National Laboratory, Pittsburgh Supercomputing Center,
Purdue University, San Diego Supercomputer Center,
Texas Advanced Computing Center,
University of Chicago/Argonne National Laboratory, and the
National Center for Atmospheric Research.