
Who is Eligible?
The Student Data Portal is open to all Penn undergraduate, graduate, and doctoral students who will use the data for academic purposes – capstone and course projects, PhD research, and independent study.
Available Datasets
Annalect
AIAB is delighted to provide a unique and comprehensive dataset from Annalect, the data management division of the Omnicom Group, a leading global advertising and marketing communications services company. This dataset includes exposures to email and online display advertisements from a travel company, as well as conversions at the company’s website. Researchers will be able to track exposures, clicks, and conversions for 10,000 individual users (tracked by cookies) for ~60 days. As travel consumers typically shop over the course of several weeks, this gives researchers the opportunity to explore how customers search for information about a highly considered product and how advertising affects the path to purchase. The dataset includes:
- Details about the exposure, including the type of ad, description, and size of the creative, and the campaign the creative was part of
- Information about whether the user clicked on the ad, and whether that click eventually led to a conversion
- The type of conversion the user engaged in, such as exploring products or receiving a purchase confirmation
The Barnes Foundation
The Barnes Foundation is a world-renowned nonprofit cultural and educational institution committed to transforming lives through art by sharing its unparalleled art collection, exhibitions, classes, and public programs with the widest audience possible.
Data includes:
- Customer data
  - 300K customers, including members and non-members
- Transactions
  - All purchase points, product info, and purchase channel
- Historic product calendar and financial spreadsheets
- List of promotions for members and non-members
- Calendar of print mail campaigns
Clientivity
Clientivity is a hotel booking software platform that empowers users to create, manage, and earn commission from personal, group, and corporate travel. The dataset includes funnel statistics, partner and end-user demographics, and hotel pricing trends.
Data includes:
- 12,000 active partners
- 53,000 partnering hotels, including location, star rating, and review count
Coqovins
Coqovins is a virtual sommelier that makes personalized wine recommendations through a chatbot at participating wine stores. The dataset includes wine attributes, wine reviews, and wine details.
Data includes:
- 1,600 individual wine reviews
- 9,100 wine attributes
- 26,000 wine label details
Expedia
Expedia, the largest online travel company in the world, provided a dataset that details events leading up to conversion (or failure to convert) for approximately 10,000 U.S.-based users searching for hotels in each of four geographic markets (Cancun, NYC, Paris, and Budapest). The data includes information about how the user arrived at Expedia, which promotional pages they viewed, details of their search query such as dates and number of travelers, which hotels were displayed in search results, which hotels were clicked on, and which hotels were purchased.
Fuel Cycle/Rent-A-Center
Fuel Cycle is an all-in-one research platform that combines qualitative and quantitative data to power real-time business decisions. Rent-A-Center stores offer name-brand furniture, electronics, appliances, computers, and smartphones through flexible rental purchase agreements that allow the customer to obtain ownership of the merchandise at the conclusion of an agreed-upon rental period.
Data includes:
- Product performance data from Rent-A-Center
  - Rental agreement and rent-to-own performance metrics for eight TV models
  - Customer data, including demographics and customer status (new, active, reactivated)
  - Rental agreements, including purchase amounts, discounts, and whether the TV was rented under a single agreement or packaged with other items
  - Transaction-level data associated with rental agreements, including product info, rate/price changes, whether the product was new or used, and sales channel
  - Store information
- Survey data from Fuel Cycle
  - Results from three separate surveys that collected data on specific TV models
Hachette Book Group
Hachette Book Group is a leading trade book publisher based in New York and a division of Hachette Livre (a Lagardère Company).
The dataset includes information for ~2,200 books that have generated significant traffic during a 12-month period:
- Sales data, including shipments, aggregated weekly point-of-sale data, and affiliate marketing sales data
- Social analytics data, including traffic from social media sites to the website
- Analytics for website pages related to books, including clicks, demographics, and visitor counts
- Email campaign data
- Book product metadata, including book information, current price, page count, genre, and ISBN
- NPD BookScan (for sales data from competitors)
- Online ad stats
- Marketing spend/budgets
Hertz
AIAB is pleased to provide a unique and comprehensive dataset from the Hertz Corporation, a world leader in retail rental cars and equipment. This dataset includes employee engagement surveys linked to Hertz locations in the U.S. and Canada, rental car transactions at those locations, and customer satisfaction surveys for those transactions. These data are longitudinal over a two-year window, providing opportunities for research from a variety of angles. Studies of organizational behavior, customer loyalty and engagement, geographic retail transactions, upselling/add-on behavior, and customer segmentation are all possible in this rich and detailed dataset. The dataset includes:
- Over 68,000 responses to a semi-annual employee engagement survey
- Over 3,000 rental locations in the U.S. and Canada, all uniquely identified across the data
- Over 80,000 responses to a post-transaction customer satisfaction survey with detailed transaction data for the corresponding rental
International Gaming Company
A major sports video game franchise has provided AIAB with a dataset covering a three-year period, including annual releases of new versions and purchase incidences of virtual currency during that time.
The dataset includes:
- Records on approximately 60,000 players covering up to three years of player behavior
- Over 1.6 million unique game session records, including player ID, session duration, and game console used
- Over 46,000 purchase incidences, including player ID, game console used, and timestamp of purchase
Quick Service Restaurant Chain
AIAB is delighted to provide a dataset from an independent purchasing cooperative that serves as a supplier to a major quick service restaurant chain. This unique dataset includes individual transactions from approximately 2,300 restaurant locations across four geographic regions and contains all purchases made by 5,000 randomly selected customers over the course of two years. In addition to typical transaction data, students will also have access to detailed information about what products each customer purchased and customer survey results – allowing a comprehensive view of the product and service quality for each customer purchase.
The dataset includes:
- Franchise point-of-sale transactions, including details on which menu item(s) were purchased, quantities of each item, payment information, and any discounts/promotions applied to the order
- Metadata on specific restaurants, including open/close dates and store type (such as street store vs. food court storefront)
- Survey responses submitted by customers linked to individual restaurants
Reed Smith
Reed Smith is a dynamic international law firm, dedicated to helping clients move their businesses forward. The firm has more than 1,700 lawyers in 28 offices throughout the United States, Europe, the Middle East, and Asia.
Data includes:
- Timecard data over three years, including task descriptions and codes, hours worked, amount billed, and information about the attorney
- Legal matter data for 8,000-10,000 clients over three years, including types of work, tags, industry, and geography