A
- A/B Testing
- Analysis in which randomly selected target customers receive variations of an item, e.g. an application interface element or web page, to measure the effect on a desired outcome, such as conversions. Also called Split Testing. Multivariate Testing is a closely related method that varies several elements at once.
- Algorithm
- An automated series of steps that carries out a mechanizable function, e.g. a mathematical function in a computer program.
- Anomaly Detection
- Analysis in which the analyst is alerted when events occur that fall outside of an established norm. Examples include Fraud Detection and Bot Detection.
- Anonymization
- Removal of information about people from a database that could be used to identify individuals.
- Application Programming Interface (API)
- A set of commands provided for a software application that allows programmers to interact with it.
- Artificial Intelligence (AI)
- Software that carries out complex tasks that humans would normally be required to perform, such as pattern recognition and decision-making.
- Attribution
- Analysis that seeks to discover what prior event caused the event being analyzed.
- Average
- See Mean.
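The A/B Testing entry above can be illustrated with a minimal sketch. The function names, the user ID format, and the visitor and conversion counts are all hypothetical, chosen only to show the idea of random assignment followed by a comparison of conversion rates.

```python
import random


def assign_variant(user_id: str, variants=("A", "B"), seed: int = 0) -> str:
    """Pick a variant pseudo-randomly, but repeatably for the same user."""
    # Seeding with the user ID means a returning user sees the same variant.
    rng = random.Random(f"{seed}:{user_id}")
    return rng.choice(variants)


def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who completed the desired outcome."""
    return conversions / visitors if visitors else 0.0


# Hypothetical counts, for illustration only.
results = {"A": (30, 1000), "B": (42, 1000)}
for variant, (conversions, visitors) in results.items():
    print(variant, conversion_rate(conversions, visitors))
```

A difference in rates alone does not show that one variant is better; see Statistical Significance.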
B
- Big Data
- Data that, because of volume or complexity, is beyond the processing capacity of ordinary analytics tools.
- Biometrics
- Data on the body of a user collected by digital tools designed for the purposes of measuring health or athletic performance.
- Bot
- A computer program that operates autonomously to carry out tasks for a user and/or to mimic the behavior of a person.
- Bulk Data
- Data that is uploaded to storage all at once, e.g. historical data. In contrast with Streaming Data.
- Business Operations
- Management of tangible and intangible business assets with the intention of deriving more value from them.
C
- Clickstream Analytics
- Analysis of user behavior based on their clicks on web pages.
- Cloud
- Computer services that are remotely hosted and accessed through the internet.
- Clustering Analysis
- The identification of groups of events linked by close proximity to each other, e.g. for the purpose of anomaly detection.
- Click-Through Rate (CTR)
- The ratio of users who click on an item to the number of users who view that item.
- Columnar
- A database management system that stores data in columns rather than rows so as to speed query performance.
- Connector
- An API that enables a user to send data to and/or receive data from different computer programs.
- Content Management System (CMS)
- A computer program that provides a simplified interface for creating and editing pages and/or records on a website.
- Conversion
- A user action defined as a desired outcome to be measured, e.g. a purchase of a product or a signup for a membership.
- Cost Per Click (CPC)
- Advertising payment method that charges a certain amount every time a user clicks on an advertisement.
- Cost Per Thousand (CPM)
- Advertising payment method that charges a certain amount for every thousand views of an advertisement.
- Cross-Channel Analytics
- Analysis of behavioral data from multiple channels (e.g. online, mobile, in-store) to provide a more complete picture of customer preferences for purposes of targeting and promotion.
- Customer Data Platform (CDP)
- An analytics platform that centralizes First Party Data and enriches it with Second and/or Third Party Data for a more accurate and actionable view of customer behavior.
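As a rough illustration of the Click-Through Rate, Cost Per Click and Cost Per Thousand entries above, the sketch below computes each one for a single campaign. The function names, the view and click counts, and the prices are hypothetical.

```python
def click_through_rate(clicks: int, views: int) -> float:
    """Ratio of users who clicked an item to users who viewed it."""
    return clicks / views


def cost_under_cpc(clicks: int, price_per_click: float) -> float:
    """Total charge when paying a fixed amount per click."""
    return clicks * price_per_click


def cost_under_cpm(views: int, price_per_thousand: float) -> float:
    """Total charge when paying a fixed amount per thousand views."""
    return views / 1000 * price_per_thousand


# Hypothetical campaign: 50,000 views and 400 clicks.
views, clicks = 50_000, 400
print(click_through_rate(clicks, views))  # 0.008, i.e. 0.8%
print(cost_under_cpc(clicks, 0.50))       # 200.0
print(cost_under_cpm(views, 4.00))        # 200.0
```

With these made-up prices the two payment methods cost the same; a higher click-through rate would make CPM the cheaper option, and a lower one would favor CPC.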
D
- Database
- A collection of information organized to allow efficient access, management, and retrieval.
- Data Management Platform
- An analytics platform that centralizes First Party Data for better tracking of advertising campaign performance.
- Data Model
- A conceptual or logical diagram of how data will need to flow in a computer program to fulfill its requirements.
- Data Silo
- A data source that is difficult to connect with other data because of dependency on Engineering resources or other constraints.
- Data Warehouse
- A central repository of business data for an organization.
F
- Federated Database
- A system in which multiple databases are linked together and can be interacted with as if they were a single database.
- First Party Data
- Information collected by a company about its own customers, from sources such as web and mobile usage tracking, CRMs and Business Analytics tools.
- Funnel Analysis
- Analysis of customer behavior in stages, typically starting with Awareness and ending with purchase or signup (Acquisition).
G
- Geolocation Data
- Device sensor data that tracks the physical location of a user.
- Governance
- Management of the quality, accessibility and security of data within an enterprise.
- Growth Hacking
- The use of rapid experimentation across marketing and product channels to drive growth.
- Growth Marketing
- See Growth Hacking.
I
- Ingestion
- The process of receiving data from a source and converting it into a format that can be accessed in a Data Warehouse.
- Internet of Things (IoT)
- The web of communication between devices equipped with internet connectivity (Smart Devices).
J
- JavaScript Object Notation (JSON)
- A common data format consisting of strings of Key Value Pairs.
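A short sketch of what JSON looks like in practice, using Python's standard `json` module. The record and its keys are made up for illustration.

```python
import json

# A hypothetical event record: each key is a text label for its value.
record = {"user_id": 42, "event": "purchase", "items": ["book", "pen"]}

# Serialize the record to a JSON string, then parse it back.
text = json.dumps(record)
restored = json.loads(text)

print(text)      # {"user_id": 42, "event": "purchase", "items": ["book", "pen"]}
print(restored == record)  # True
```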
K
- Key Value Pair
- A common data format in which the data, or “Values”, are identified with text labels called “Keys”. Used for some NoSQL databases.
- Keyword
- A word used as Metadata to identify a web page or other content in a search.
L
- Live Data
- Data that is Connected with other relevant data, Current, and Easily Accessible to the people and processes in an organization that need to use it.
- Lock-In
- Inability to easily move data from one platform to another due to limitations imposed by a data storage provider, Business Application, or CRM.
- Log
- A text string that contains information about the state of a computer program or an event. Computer Data is composed of Logs.
- Lookalike Modeling
- Analytic process for targeting online advertising to website visitors with interests similar to those of a company’s existing customers.
M
- Machine Learning
- A category of Artificial Intelligence (AI) software in which the behavior of a program changes and improves based on exposure to new data.
- Map/Reduce
- A type of algorithm that breaks a data set apart so it can be processed on separate systems (Map) and then combines the data returned by those processes to create a report.
- Mean
- The sum of all the numbers in a set divided by the count of numbers in the set. Also called the Average value of a set.
- Median
- The middle point of a number set, for which half the numbers in the set will be above and half below.
- Metadata
- Data that gives information about other data. On a web page, the Keywords for the page are metadata.
- Migration
- The process of moving a company’s existing data from one platform to another.
- Model
- See Data Model.
- MongoDB
- A popular open source NoSQL database.
- Multivariate Testing
- See A/B Testing.
- MySQL
- A popular open source SQL database.
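The Mean and Median entries above can be compared on a small example using Python's standard `statistics` module. The numbers are arbitrary and were picked to include one outlier.

```python
import statistics

# Arbitrary sample with one large outlier (100).
values = [2, 3, 5, 8, 100]

mean_value = statistics.mean(values)      # (2 + 3 + 5 + 8 + 100) / 5
median_value = statistics.median(values)  # middle value of the sorted set

print(mean_value)    # 23.6
print(median_value)  # 5
```

The outlier pulls the Mean far above most of the values, while the Median stays at the middle of the set.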
N
- Normal Distribution
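- A symmetric, bell-shaped distribution in which values cluster around the Mean and become less frequent the farther they lie from it. Many randomly varying quantities approximately follow this pattern.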
- NoSQL
- A general term that describes databases with structures and rules that differ from the row-and-column based SQL format.
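For the Normal Distribution entry above, here is a minimal sketch using `statistics.NormalDist` from the Python standard library (available since Python 3.8). A mean of 0 and a standard deviation of 1 are assumed purely for illustration.

```python
from statistics import NormalDist

# Assumed parameters: mean 0, standard deviation 1.
dist = NormalDist(mu=0.0, sigma=1.0)

# Share of values expected within 1 and 2 standard deviations of the mean.
within_one = dist.cdf(1.0) - dist.cdf(-1.0)
within_two = dist.cdf(2.0) - dist.cdf(-2.0)

print(round(within_one, 4))  # 0.6827
print(round(within_two, 4))  # 0.9545
```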
O
- Omnichannel Marketing
- A marketing strategy that uses data from different sales channels to drive sales and optimize customer experience across other channels. An example is using data about a customer’s in-store shopping behavior to provide targeted email offer coupons.
P
- Portability
- Ability to move data, especially large amounts of data in different formats, from one platform to another.
- Predictive Analytics
- A marketing function that uses machine learning to extrapolate customer traits to predict future purchasing behavior.
- Presto
- An open source distributed SQL query engine that allows very fast queries of large and distributed databases.
R
- Recommendation Engine
- Software that uses data derived from customers’ purchase habits to determine products to recommend, as on an e-commerce site.
- Relational Database
- A database organized in tables of rows and columns.
- Return On Advertising Spend (ROAS)
- Advertising spend metric, expressed as revenue earned as a percentage of the dollars spent on a campaign. Formula: Revenue / Cost x 100. If you spent $200 and made $300 on a campaign, your ROAS would be 300/200 x 100 = 150%.
- Return On Investment (ROI)
- Spend metric, expressed as profit as a percentage of the amount invested. Formula: (Revenue - Cost) / Cost x 100. If you spent $200 and made $300 on a campaign, your ROI would be 100/200 x 100 = 50%.
- Retargeting
- Online advertising in which ads are shown to users based on their browsing history.
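The ROAS and ROI formulas above can be checked with a small sketch. The function names are hypothetical; the figures are the $200 spend and $300 revenue used in the entries.

```python
def roas_percent(revenue: float, cost: float) -> float:
    """Return On Advertising Spend: revenue as a percentage of cost."""
    return revenue / cost * 100


def roi_percent(revenue: float, cost: float) -> float:
    """Return On Investment: profit (revenue minus cost) as a percentage of cost."""
    return (revenue - cost) / cost * 100


print(roas_percent(300, 200))  # 150.0
print(roi_percent(300, 200))   # 50.0
```

Note the parentheses in the ROI calculation: the cost is subtracted from the revenue before dividing.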
S
- Scalability
- Ability of a platform to automatically scale to meet user needs. If a platform stops working or performance degrades once usage goes beyond a certain threshold, it is not scalable.
- Schema
- In computer programming, the structure of a database.
- Schema-Flexible
- A type of data processing that allows data with different schema, including Schemaless data, to be inserted into the same database, aiding in connectivity.
- Schemaless
- Data that is not structured in rows and columns, e.g. data stored in many NoSQL databases.
- Segmentation
- Division of a group of customers into subgroups based on shared characteristics in order to target them more precisely in marketing campaigns.
- Second Party Data
- Data on customers obtained from partner companies.
- Software as a Service (SaaS)
- Business software that is hosted in the cloud, rather than being installed locally on a user’s computer.
- Split Testing
- See A/B Testing.
- SQL
- Structured Query Language, a programming language that can be used to access data in a relational database. Also refers to the specific type of relational database system that can be queried using that language.
- Standard Deviation
- A measure of how widely the values in a group are spread around the Mean.
- Statistical Significance
- A difference or correlation between two or more variables that is strong enough to be unlikely to result from chance alone, usually judged against a preset threshold such as a 5% probability.
- Streaming Data
- Data that consists of logs generated dynamically by software activity. In contrast with Bulk Data.
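The Standard Deviation and Statistical Significance entries above can be sketched with the Python standard library. The two-proportion z-test shown here is one common way to check an A/B test result, not the only one; the function name and all counts are hypothetical.

```python
import math
import statistics

# Standard deviation of an arbitrary sample (population form).
values = [2, 4, 4, 4, 5, 5, 7, 9]
spread = statistics.pstdev(values)
print(spread)  # 2.0


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates.

    Assumes large samples and a pooled rate strictly between 0 and 1.
    """
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    return 2 * (1 - statistics.NormalDist().cdf(abs(z)))


# Hypothetical A/B test: 120 of 2,400 visitors convert on A, 156 of 2,400 on B.
p_value = two_proportion_p_value(120, 2400, 156, 2400)
print(p_value < 0.05)  # True
```

A p-value below the chosen threshold (5% here) suggests the difference between the variants is unlikely to be due to chance alone.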
T
- Third Party Data
- Aggregated Customer Data from an external vendor used to enrich a company’s First Party Data for better Segmentation in a Customer Data Platform.
- Treasure ML
- A library for implementing Machine Learning in Treasure Data.
- Treasure Workflow
- A Treasure Data feature that allows complex workflows to be constructed, scheduled and reused.