Glossary of Terms

Access Control Systems
Access Control Systems are systems that manage physical access to Harvard-owned properties, structures, or services.

Authorization
Permission to access resources in a digital domain (after positive authentication).

Authorization Proxy Service (AuthZProxy)
Service provided by Harvard that allows applications to check the status of users prior to allowing system access.

Confidential Information
Information about a person or an entity that, if disclosed, could reasonably be expected to place the person or the entity at risk of criminal or civil liability, or to be damaging to financial standing, employability, reputation or other interests.

Harvard is bound by laws, such as FERPA and HIPAA, and by contracts, such as some grants and vendor contracts, to protect some types of confidential information. Additionally, Harvard, under University, School or unit policies, requires protection of other kinds of information about the University or Schools, faculties, departments and other units and about Harvard property (tangible or intangible). Confidential Information also includes High-Risk Confidential Information, as defined below, as well as other non-public personally identifiable information about individuals.

Nothing in Harvard’s policy on Confidential Information is intended to restrict or limit in any way employees’ rights to discuss terms and conditions of their employment with each other or with third parties. Harvard’s policy is intended to protect Confidential Information, including confidential personnel information, from disclosure.

High-Risk Confidential Information (HRCI)
HRCI is personally identifiable information whose confidentiality is governed by law. High-Risk Confidential Information includes a person's name in conjunction with the person's Social Security, credit or debit card, individual financial account, driver's license, state ID, passport number or visa, or a name in conjunction with biometric information about the named individual. High-Risk Confidential Information also includes personally identifiable human subject information and medical information. Improper access to, use of or release of High-Risk Confidential Information may trigger legal reporting requirements. Such information is subject to legal requirements when being disposed of.

Examples of Confidential Information (in addition to HRCI) include the following: unpublished University financial information and development plans, salary information, employee benefits and other HR information (but employees may discuss terms and conditions of their employment, including salary and benefits, with each other or with third parties), grades and other non-directory education records, financial information about applicants, non-public personal and financial data about donors, Harvard identification numbers, information received under grants and contracts subject to confidentiality requirements, information on facilities security systems, unpublished research data, invention disclosures and patent applications, and information specifically designated as private or confidential.

Contract Rider
Text approved by Harvard counsel to be appended to Harvard contracts in which the vendor is working with Harvard confidential information. The contract riders specify the protections that must be implemented and the criteria that must be met in order for a vendor to work with Harvard confidential information.

De-Identified Data
Data from which all information that can be used to identify individuals, either directly or indirectly, has been removed. For information to be considered de-identified under HIPAA, 18 separate identifiers must be removed from the individual's record. Covered entities have the option of stripping fewer identifiers from individual records, but only if an expert with knowledge of statistical and scientific principles and methods assures that individuals will not be identifiable from the disclosed data or by comparison of the data with other sources of information.

De-Identified Research Data Set
A Research Data Set where all personal identifiers have been removed (and normally replaced by a random identity key) such that no personally identifiable data remains.

Encryption
The algorithmic transformation of a data set to an unrecognizable form using an encryption key. The original data set or any part thereof can be recovered only with knowledge of a secret decryption key.

FERPA
Family Educational Rights and Privacy Act; a federal law that requires protecting the privacy of student records.

FERPA also gives a student the right to block public display of directory information. Schools are required to convey to students the information they classify as directory information, and to allow students and parents a reasonable amount of time to request that the School not disclose directory information about them. This request is referred to at Harvard as a FERPA block.

High-Risk Confidential Information (HRCI)
High-Risk Confidential Information includes a person's name in conjunction with the person's Social Security, credit or debit card, individual financial account, driver's license, state ID, or passport number, or a name in conjunction with biometric information about the named individual. High-Risk Confidential Information also includes human subject information and personally identifiable medical information. Improper access to or release of High-Risk Confidential Information may be subject to legal reporting requirements. Such information is subject to legal requirements when being disposed of.

Identity Key
The code used in place of Personal Identifier(s) in a Research Data Set.

Identity-Mapping File
Data set that can be used to associate identity keys with individuals.
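
As an illustration of how an Identity Key, an Identity-Mapping File, and a De-Identified Research Data Set relate to one another, here is a minimal Python sketch. The function name `deidentify`, the field names, and the sample records are hypothetical and are not part of any Harvard system; the sketch assumes records are simple dictionaries and that the caller already knows which fields are personal identifiers.

```python
import secrets


def deidentify(records, identifier_fields):
    """Split records into a de-identified data set and an identity-mapping file.

    Hypothetical helper for illustration only.
    """
    deidentified = []
    identity_map = {}  # identity key -> the personal identifiers that were removed
    for record in records:
        # Random identity key; it is not derived from the record's contents.
        key = secrets.token_hex(8)
        while key in identity_map:  # regenerate in the unlikely event of a collision
            key = secrets.token_hex(8)
        identity_map[key] = {
            field: record[field] for field in identifier_fields if field in record
        }
        cleaned = {
            field: value
            for field, value in record.items()
            if field not in identifier_fields
        }
        cleaned["identity_key"] = key
        deidentified.append(cleaned)
    return deidentified, identity_map


# Made-up sample data.
records = [
    {"name": "Alice Example", "ssn": "000-00-0000", "score": 7},
    {"name": "Bob Example", "ssn": "000-00-0001", "score": 4},
]
data_set, mapping_file = deidentify(records, {"name", "ssn"})
```

Because the mapping file can re-associate identity keys with individuals, it is itself personally identifiable data and needs to be protected accordingly.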

IRB Application
The research application submitted to the local IRB for review and approval.

Mass Email message (or Broadcast Email message)
An electronic communication sent to a campus-wide or ad hoc group of individuals across multiple Schools or administrative units.

Limited Data Set
A limited data set contains more information about individuals than de-identified data. A limited data set permits use of some identifiable health information, while excluding direct identifiers. This type of disclosure requires a Data Use Agreement between the researcher and the covered entity that establishes the permitted uses and disclosures of the data set.

Non de-identified data set
Data set that contains personally identifiable data. Not all data sets can be reasonably de-identified (for example, an audio-recorded interview in which a subject identifies himself or herself, or a videotape that includes images of a subject’s face). In such cases, the data set must be considered a non de-identified data set.

Payment Card Industry Standards (PCI)
The Payment Card Industry Data Security Standard (PCI DSS) is an information security standard for organizations that handle cardholder information for the major debit, credit, prepaid, ATM, and POS cards.

Defined by the Payment Card Industry Security Standards Council, the standard was created to increase controls around cardholder data in order to reduce credit card fraud resulting from exposure of that data.

Personal identifiers
Any data elements within a data set that singly or in combination can uniquely identify an individual, such as a Social Security number, name, address, birth date, physical characteristics, demographic information (e.g., combining gender, race, occupation, and location), hospital-patient numbers, or history.

Personally identifiable data
Data that are associated with living persons, or that can be associated with living persons by deduction from personal identifiers in a data set.

Research Data Set
A body of data elements collected or used in the course of research.

RMAS
Risk Management and Audit Services, Harvard University's internal audit group.

Secure location
A place (room, file cabinet, etc.) to which only the Principal (or lead) Investigator and any other specifically approved individuals have access through lock and key. Either physical or electronic keys are acceptable.

Sensitive Data
Any data that can be linked to individual subjects involving medical information, personal financial information, social security numbers, and any information the disclosure of which outside the research could reasonably place the subjects at risk of criminal or civil liability or be damaging to the subjects' financial standing, employability, insurability or reputation. (Expanded from 45 CFR 46.101(b)(2)(ii).) Any data concerning Harvard students should be considered Sensitive Data.

University LDAP Enterprise Directory (Attribute Service)
Harvard's University LDAP directory acts as an official University attribute authority. It contains profile data about HUID holders and, to a much lesser extent, about XID holders.

University PIN system (Authentication Service)
Provides authentication services for populations that hold Harvard ID numbers (students, faculty, staff, some affiliates).

XID system (integrated with University PIN System)
Allows people who do not hold a Harvard ID to register for an XID, a separate type of ID number that can be used for authentication with University PIN-enabled applications.


N.B. Hospitals and other health care providers, as well as health insurance companies, are held to a very stringent standard for de-identifying data under the Health Insurance Portability and Accountability Act of 1996 (HIPAA). To be considered de-identified for HIPAA purposes, an expert in statistics would need to conclude that disclosure of the information in a particular data set presented a very small risk that the information could be “used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is the subject of the information.” 45 CFR 164.514(b)(1)(i). In addition, under HIPAA a de-identified data set must be stripped of certain enumerated elements.

Harvard researchers are not held to the same standard when de-identifying data. However, in creating a de-identified data set, one can consider the HIPAA elements. They are: names; street address; city; county; precinct; zip codes and their equivalent geocodes, except one may include the first 3 digits of a zip code unless fewer than 20,000 people reside within such zip code; dates related to birth, hospital admission or discharge, death, and all ages over 89 years old; telephone, fax, social security, vehicle identification, and license numbers; email addresses, health plan, medical record or account numbers, device identifiers and serial numbers; URLs or IP addresses; biometric identifiers such as fingerprints; full face photos or comparable images; and any other unique identifying number other than code numbers assigned for the research, such as in an Identity-Mapping File. 45 CFR 164.514(b)(2).
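
The zip code and age elements in the note above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the `prefix_population` lookup table, and its figures are hypothetical, the population is assumed to be keyed by 3-digit prefix, and a prefix missing from the table is conservatively treated as having too few residents.

```python
def generalize_zip(zip_code, prefix_population, threshold=20_000):
    """Return the 3-digit zip prefix, or None when the prefix must be dropped.

    prefix_population is a hypothetical mapping of 3-digit prefix -> residents.
    """
    prefix = zip_code[:3]
    # Unknown prefixes default to 0 residents, so they are dropped as well.
    if prefix_population.get(prefix, 0) < threshold:
        return None
    return prefix


def strip_high_age(age):
    """Return the age unchanged, or None for ages over 89."""
    return None if age > 89 else age


# Made-up population figures, for illustration only.
prefix_population = {"021": 500_000, "059": 12_000}

kept = generalize_zip("02138", prefix_population)     # populous prefix is kept
dropped = generalize_zip("05901", prefix_population)  # sparse prefix is dropped
```

The remaining enumerated elements (names, dates, contact details, and so on) would simply be removed from each record rather than generalized.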