What is the key difference between a histogram and a bar chart?
A. Bar charts are used for continuous data
B. Histograms have gaps between bars
C. Histograms represent continuous data, while bar charts represent categorical data
D. Bar charts are always vertical
Duplicate data refers to:
A. Data that is wrongly formatted
B. Data that appears more than once in a dataset
C. Incorrectly processed data
D. Data that is cleaned
The process of documenting assumptions and limitations in data analysis falls under:
A. Data validation
B. Data cleaning
C. Report documentation
D. Data documentation
Metadata in data documentation refers to:
A. Data about data
B. The process of collecting data
C. Final results of data analysis
D. A visual representation of the data
Which of the following is essential for proper data documentation?
A. Variable descriptions and units
B. Styling and formatting choices
C. Graph types used
D. The fonts and colors used in the document
What does a histogram display?
A. Categories of data
B. Frequency distribution of continuous data
C. Proportions of different categories
D. Trends over time
Which of the following statements about the median is true?
A. It is always greater than the mean
B. It divides the data into two equal halves
C. It is affected by extreme values
D. It can only be calculated for numerical data
What is the primary purpose of data documentation?
A. To describe the data collection process
B. To summarize the analysis results
C. To provide definitions and metadata for datasets
D. To create charts and graphs
Which of the following is an example of a regular report?
A. A report generated for a one-time project
B. A monthly sales performance report
C. A special investigation report
D. A report created to answer a specific query
An ad-hoc report is typically generated:
A. On a scheduled basis
B. In response to a specific business question or request
C. For every financial quarter
D. Annually
What does variability measure in a dataset?
A. The most frequent data point
B. The central tendency of data
C. The spread of data
D. The size of the dataset
Which of the following is NOT a type of data?
A. Nominal
B. Ordinal
C. Interval
D. Procedural
In data analysis, which term describes variables that can take any value within a range?
A. Discrete variables
B. Continuous variables
C. Dependent variables
D. Constant variables
When the data is skewed, which measure of central tendency is most reliable?
A. Mean
B. Median
C. Mode
D. None of the above
Which measure of central tendency is best used with nominal data?
A. Mean
B. Median
C. Mode
D. Variance
Which chart type is ideal for displaying relative proportions of categorical data?
A. Line chart
B. Histogram
C. Pie chart
D. Bar chart
When identifying duplicates, it is important to check for:
A. Formatting errors
B. Misspellings
C. Exact matches across relevant columns
D. Missing values
The primary difference between ad-hoc reports and regular reports is:
A. Ad-hoc reports are detailed; regular reports are summary-level
B. Ad-hoc reports are generated as needed; regular reports are scheduled
C. Ad-hoc reports are financial; regular reports cover all areas
D. Ad-hoc reports use graphs; regular reports use tables
In a symmetric, unimodal distribution, how are the mean, median, and mode related?
A. They are equal
B. Mean is always greater than the mode
C. Mode is always smaller than the median
D. Median is greater than the mean
What is data in the context of data analysis?
A. Processed information
B. Raw facts and figures
C. Conclusions drawn from research
D. A summary of findings
Which of these is an example of unstructured data?
A. Data in a relational database
B. A financial statement
C. A social media post
D. An Excel table
Why is it important to remove duplicate data before analysis?
A. It improves the style of the report
B. It reduces file size
C. It ensures accuracy of the analysis
D. It speeds up the cleaning process
In a bar chart, what do the lengths of the bars represent?
A. The frequency of occurrence of data categories
B. The relationship between two variables
C. The total of all data values
D. Trends over time
What is the main characteristic of an ad-hoc report?
A. It is based on real-time data
B. It follows a fixed template
C. It is created on demand
D. It requires approval before distribution
Structured data is typically stored in:
A. Text documents
B. Spreadsheets or databases
C. Images
D. Audio files
In a histogram, what does the height of each bar represent?
A. The cumulative frequency
B. The class midpoint
C. The frequency of the data within a range
D. The average of the data values
What is the purpose of descriptive statistics?
A. To summarize and describe data
B. To infer conclusions from a sample
C. To predict future data trends
D. To test hypotheses
Which measure of central tendency is most affected by outliers?
A. Mean
B. Median
C. Mode
D. Range
Which of the following is NOT true about regular reports?
A. They follow a fixed reporting schedule
B. They contain key performance metrics
C. They provide answers to specific, unexpected queries
D. They are typically distributed to a wide audience
What is the mode in a dataset?
A. The arithmetic average
B. The middle value
C. The most frequently occurring value
D. The highest value