Types of Queries
· Writing Your Own Queries
· "Canned" Queries and Reports
· Guidelines for Sharing Queries
General Guidelines on Executing Queries
· Information Restrictions
· Guidelines for Writing Queries
· Verifying Query Results
· Tips for Querying with
Writing Your Own Queries
When writing your own query, you define the set of choices and conditions
that you want to use to retrieve data stored in the Warehouse. You determine
such attributes as data elements to be returned (e.g., last name, city,
age), conditions for selecting records (e.g., equal to or less than),
and sort criteria (the order and priority in which results are to be sorted).
You may want to keep in mind some general questions to guide you in composing
Refer to the Guidelines for Writing Queries below for
more detailed examples. Also, for more information about considerations
particular to the General Ledger data collection, refer to the collection online tutorial. (A similar page also exists for the Salary Management data
- What information are you looking for? In what collection does this
data reside, and which Business Object universe would be best to use?
- Bear in mind the level of detail data you need, the time periods concerned,
and which source system you would use to verify the data retrieved.
- Once you have a basic idea of the results you need, consider how the
query should be contrained -- by time period? account segment(s)? employee
or organization names/codes?
- What will you do with your results? If you are presenting them to
others, you may want to include segment descriptions for those unfamiliar
with codes. Also, if you plan to export the data, you may want to include
objects which you have used to constrain your query, to better identify
the data in the exported file. (For example, although you may have used
Accounting_Period as a constraint, it might help to have the period
appear as a column in your exported file, so you know what period that
Be aware that University data is sensitive and must be handled responsibly.
Note that some data may be restricted by school or by department or cannot
be released to persons outside the University. Persons who work with student
data should also be aware of student rights as stated in the Buckley Amendment.
More information about restrictions on University data is available in
General Guidelines for Executing Queries
Queries often require complex conditions to return appropriate results.
WHERE clauses (also known as query filters or conditions) reduce the amount of data to be processed
in a SELECT statement (comprised of desired result objects or columns)
by specifying that only those rows meeting the criteria in the WHERE clause
are displayed. Depending upon which tool you use to query the Data Warehouse,
your degree of control over the query language and operators will vary.
For example, BusinessObjects and Microsoft Access employ graphical interfaces
which construct appropriate SQL behind the scenes based on the tables
and joins the user indicates. Oracle SQL*Plus, on the other hand, requires
the user to write his or her own SQL statments. The Oracle database system,
which forms the foundation of the Data Warehouse, permits the use of powerful
SQL operators, some of which may also be available as post-query operators
in your desktop tool.
The following guidelines are for all users of the Data Warehouse, regardless
of query tool, and provide good practices for efficient querying:
· Refer to the help documentation. The main difficulty
most people have with writing queries is knowing which table to use. If
you are unsure about which table to use, refer to the table help that
is available for the data collection. Pay particular attention to the
sections "Common Uses" and "Cautions." These sections may help you decide
if you have selected the appropriate table for your query. If you are
unsure about a data element, look up its definition. The Office of Data
Administration has included data element definitions and table help as
part of the data collection documentation on the web.
· Take advantage of indexes. If possible, include
an indexed data element in your condition statement. A query with a record
selection condition using an indexed data element tells the system to
go directly to the rows in the table that contain the value indicated
and to stop retrieving data when the value is no longer found. If a query
does not select records based on an indexed data element in its record
selection condition, the system starts searching at the first row in the
table and works through every row until it reaches the last row in the
table. Indexed columns are noted in each collection's documentation.
Certain operators or query segments are processed by the system without
the use of indexes, even if the column in the condition is indexed. It
may, of course, be necessary for you to construct your query in this manner
to retrieve correct results, but in considering alternatives in query
construction you may wish to keep in mind the following situations
where indexes may not be used:
Negative comparisons such as Not Equal (represented by
=! or <> in SQL), Different From, or Not In. Avoid negative phrasing of condition
statements as much as possible. In general, it is easier (both for the
system and for you) to interpret a positive phrase than a negative phrase.
For example, instead of the condition statement "If term is not greater
than 1998A," rephrase the statement to "If term is less than or equal
to 1998A." Or, if practical, eliminate the condition from the query and
filter your results on the desktop.
Nulls such as Is Null or Is Not Null.
Like or Matches Pattern comparison with a date or number column.
For example, to retrieve employee payments from March (of any year) use
"FISCAL_MONTH_SEQ = '09'" rather than "CHECK_DATE Like 03/%".
Wildcards at the beginning of a string. Avoid matching
patterns beginning with a wildcard (Like %...). A wildcard at the end
of a pattern is definitely appropriate and can be very efficient (e.g.,
Where Fund Like 5% will retrieve all Funds 500000 - 599999).
Indexed columns modified by an expression or function (e.g.,
rather than concatenating all Chart of Account segments as COA_CNAC||COA_ORG||COA_BC||COA_FUND||COA_OBJECT||COA_PROGRAM||COA_CREF,
select the column COA_ACCOUNT, which is indexed). Also, comparing
an indexed column to another indexed column using Greater Than (>)
or Less Than (<).
· Check the "and/or" qualifiers in the records selection
criteria of the condition statement. For example, a query coded to get
students with the following conditions statement will actually return
every student in COL for 1998A and every student for 1998C regardless
of the division:
If division is equal to 'COL' and term is equal to '1998A' or
term is equal to '1998C'
The query coded to get students with the following conditions statement
will return every student in COL for 1998A and 1998C:
If division is equal to 'COL' and (term is equal to '1998A'
or term is equal to '1998C')
· If your access to data is restricted, do not force
the security system to select records for you. For example, if you are
authorized to access data only from a particular department, one of your
record selection conditions should state "If Organization='My Organization',"
where organization is the code for your department.
· Review your query before executing it. Check to
make sure that your query is as precise as possible. This includes selecting
the tables that will give the best results, reviewing selection conditions
and sort criteria, and if it makes sense to do so, including at least
one indexed data element in the conditions statement. For example, if
you want to find all undergraduate freshmen and their names, choose the
Person table rather than the ADDRESS table. This is because a student
can have multiple addresses, and choosing the ADDRESS table would return
a name for each address the student has listed.
· Be aware of data that is subject to change and
its effect on your results. For example, a grade change can affect a student's
grade point average (GPA). A query executed before and after the grade
change may or may not result in a changed GPA. In addition, keep in mind
that there is a "data delay" between Warehouse collections and their respective
source systems. Refresh schedules are noted in each collection's documentation.
· Give the query time to execute. Queries can take
many minutes to execute; complex queries can take longer. It is not uncommon
for a query to take 5 to 10 minutes to complete. In general, let the query
run until it finishes. If the query takes longer than 1 hour to complete,
contact Data Administration.
· Note the date and time of the query when creating
reports or communicating results to others.
Verifying Query Results
Although query tools make accessing data easier, remember that University
data is complex. You can easily compose a "bad query." That is, you may
have selected incompatible data elements, the wrong data elements, or
structured the conditions in your query such that it returns improper
results. Be aware that it is possible to get an incorrect answer or misinterpret
the data. For example, when mailing information to students, it is more
preferable to select "current address" rather than "temporary address"
as the address type. This is because the system looks at the expiration
dates for the temporary and local addresses and computes which address
is the most current.
To minimize the risk of misinterpreting data or getting the wrong answer:
· Compare your results with known reports that are
up-to-date and accurate.
· Perform "sanity checking." Ask yourself if the
information you have is a reasonable answer. For example, few classes
at Penn enroll more than 100 students per class so a query returning a
class enrollment of 250 students is questionable; grants are usually less
than Penn's overall budget thus a grant amount that is listed as twice
its Penn budget is unlikely; and Penn's student population is approximately
50% men and 50% women so a query returning a class enrollment of 98% males
is probably incorrect.
· Consider estimating likely parameters for a reasonable
response before executing the query.
· Try sequential queries that break large sorts
down to components, then add them to be sure the whole is accounted for.
For example, if searching for Wharton undergraduate Asian female students,
look at: all Asian students, male Asian students, etc. to check the reliability
of your query.
· Use one or two conditions and zero in on what you
want. When developing your query to answer a question, start by retrieving
detail information, with just one or two record selection conditions.
Zero in on what you want, looking at how the results change as you add
each condition. If your goal is a summarized report, see how the detail
report compares to the summarized one. Make sure the query includes what
you want, all of what you want, and nothing but what you want.
· Ask someone who knows the data to check your results.
Ask someone in your department or school, or contact Data Administration. Also, ask the person requesting the report
what results they expect to get. Or ask if the person can suggest another
existing report that you can use to check to see if your results are at
least in the ballpark. If your query gets results that are nowhere near
what the requester expected, your query may be in error.
· Continue to become familiar with and knowledgeable
about University data by participating in the data collection listservs.
· Just look at the report. Sometimes, just looking
at the query results can help you spot glaring errors. (Why am I getting
so many rows returned? Whoops--I forgot to screen on accounting period!)
· Run a similiar query. Try running another query
that approaches the question differently, and see if both queries get
the same results. For example, if you want figures on telephone charges,
in one query, you could set a condition on a list of object codes, and
in another query, you could set a condition on the parent object code.
If the queries do not get the same results, find out why not.
· Check other reliable sources. Check other sources
of information, such as BEN Financials or other source system screens
or reports, to see how they compare with your query results.
"Canned" Queries and Reports
· financial queries
Canned queries, sometimes known as corporate documents, are predefined queries. In most instances, canned queries
contain prompts that allow you to customize the query for your specific
needs. For example, a prompt may ask you for a School, department, term,
or section ID. In this instance you would enter the name of the School,
department or term, and the query will retrieve the specified data from
The main reason for using a canned query or report rather than creating
your own is that your chances of misinterpreting data or getting the wrong
answer are reduced. Someone who is knowledgeable about University data
has correctly identified the conditions and criteria needed to execute
a successful query. You are assured of getting the right data and the
Whenever possible, canned queries and reports are provided to give you
standardized perspectives on Warehouse data. These types of queries vary
with the query tool and the data collection. Thus, before creating your
own query, check to see if a canned query or report already exists for
the information you want.
For a list of canned Business Objects reports using financial data,
To access Business Objects corporate documents:
- Point your browser to InfoView login page (https://bobjprod.isc-seo.upenn.edu/InfoViewApp/logon.jsp)
- Enter your Business Objects User Name and Password. (You can make sure your Data Warehouse and Business Objects account passwords are synchronized by using the password change application.
- Once you've successfully logged in to InfoView, navigate to the Document List. You can choose to browse documents by viewing them by Folder or Category. The Folder view appears by default - all the shared documents are available under Public Folders. Although you'll see all the categories available, you'll only see documents whose data you're authorized to refresh. You can also use the search option to search for a report by name.
- Click on a category or folder name to view the documents available to you. The list of documents appears in the right-hand pane of the InfoView window, and displays the documents by name, with description (if available)..
- Simply double-clicking clicking the document name will open it.
- Keep in mind that documents in the repository are saved without any data, so that you can click the Refresh Data link at the upper right corner of the screen to refresh the report according to your security and any parameters.
- If the report includes prompts, you will be presented with a screen asking you to fill in those prompts, and then click the Run Query button to execute the report.
Guidelines for Sharing Queries
If you would like to contribute queries to the shared repository, or
have questions about sharing queries, please contact Data Administration. You may also share queries with other users. In preparing a query for the repository,
please follow these guidelines:
- Use prompts with your conditions to keep the query as generic as possible,
and to take advantage of the Data Warehouse built-in security. (For
example, COA Org = [prompt], rather than a hard-coded value.)
- Purge the data from your query before sending, so that others will
not be able to see your confidential data.
- Prepare some brief accompanying documentation for your query so that
Data Administration can make it generally available. Some helpful information
might be suggested use, any special timing issues or cautions, a list
of all result objects and conditions, and the logic for any report variables.
Please suggest a source for verification of results, if possible.