This project represents 20% of your grade. It is designed so that you can demonstrate your understanding of the topics covered in the course. You will be graded according to the attached Grading Rubric. Read the Rubric; it is a “road map” to what is required. You may work either alone or with a partner. If working with a partner, both names MUST appear on the report.

Time Schedule:

All work is to be completed and turned in by the end of the scheduled Final Exam class period.

April 28th: Pick your Data Set and partner (if applicable), and get approval to base your project on that Data Set and group (10% of Project Grade). Present your choice to me by e-mail (barrettb@wcsu.edu) on or before April 28th. Approvals will be given by return e-mail.

Thursday, May 12th: Scheduled Final Exam Period (5/12/2022, 8:30-10:00). Projects are due by the end of the scheduled exam period.

The Project:

You are to write a report based on some Data you have selected. Choose a Data Set with ≥ 100 data points and do an analysis of that data. You are to write as if you are making a presentation of your analysis of the Data Set. In your report you must cover/include/calculate/formulate/mention (get the idea?) the following as related to your Data Set.

Provide a description of the Data in text, numbers and pictorial form (graphs, tables, etc.)

Provide Descriptive Statistics on the Data;

Determine whether the data is Normally distributed;

Provide probabilities and/or probability distributions describing the data;

Calculate a Confidence Interval for the mean and one for the chosen Proportion.

Select a segment (sample) of the data, based either on time or other parameter, and do a Hypothesis Test for the mean of the segment against the total data set;

Discuss the relationship between the Confidence Interval and the Hypothesis Test of the mean;

Select TWO segments (samples) of the data, based either on time or other parameter, and calculate a Confidence Interval for the difference in the means;

Select TWO segments (samples) of the data, based either on time or other parameter, and do a Hypothesis Test for Two Samples for the mean;

Select TWO segments (samples) of the data, calculate a proportion for some characteristic of the data, and do a Hypothesis Test for Two Samples for the proportion;

If possible, do a Regression Analysis on your data set; if not possible, state why;

For one of the following you are to use both “BootStrap” and traditional formula calculation methods to calculate results. In doing this you are to compare the results achieved by both methods and discuss the differences and similarities of same.

Confidence interval for the difference in means;

Hypothesis test for the difference in means;

Draw some Conclusions from the data based on your analysis of the statistics above (part 12). Support your conclusions with p-values, confidence intervals, tables, and graphs.
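
One of the tasks above is a Hypothesis Test for Two Samples for the proportion, and your conclusions should cite the resulting p-value. As a rough illustration of where such a p-value comes from, here is a minimal Python sketch. The counts are made up, the function name is hypothetical, and the pooled z-test shown is only one common textbook form; if the version covered in class differs, use the class version.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    # Standard error of (p1 - p2) under the null hypothesis.
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard Normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 45 of 100 in segment 1, 30 of 100 in segment 2.
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A small p-value is evidence against equal proportions in the two segments; report it alongside the two sample proportions and sample sizes.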

Use the tools and formulas covered in class. You may Bootstrap your results, Calculate using Formulas, or Both, unless stated otherwise in these instructions.
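
For the task that requires both methods, the comparison might look something like the following Python sketch of a Confidence Interval for the difference in means. Everything here is an assumption made for illustration: the data are simulated, the function names are hypothetical, the formula interval uses a Normal (z) critical value rather than a t value, and the bootstrap interval uses the percentile method. Follow the methods taught in class where they differ.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def formula_ci(a, b, level=0.95):
    """Formula CI for mean(a) - mean(b) with a z critical value (large-sample assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = mean(a) - mean(b)
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return diff - z * se, diff + z * se


def bootstrap_ci(a, b, level=0.95, reps=5000, seed=1):
    """Percentile bootstrap CI for mean(a) - mean(b)."""
    rng = random.Random(seed)
    # Resample each segment with replacement and record the difference in means.
    diffs = sorted(
        mean(rng.choices(a, k=len(a))) - mean(rng.choices(b, k=len(b)))
        for _ in range(reps)
    )
    lo_index = int(reps * (1 - level) / 2)
    hi_index = int(reps * (1 + level) / 2) - 1
    return diffs[lo_index], diffs[hi_index]


# Simulated stand-in data; replace with two segments of your own Data Set.
data_rng = random.Random(0)
segment_a = [data_rng.gauss(52, 10) for _ in range(60)]
segment_b = [data_rng.gauss(47, 10) for _ in range(60)]

formula = formula_ci(segment_a, segment_b)
boot = bootstrap_ci(segment_a, segment_b)
print(f"Formula CI:   ({formula[0]:.2f}, {formula[1]:.2f})")
print(f"Bootstrap CI: ({boot[0]:.2f}, {boot[1]:.2f})")
```

With reasonably large, roughly symmetric samples the two intervals should be close; when they are not, that difference is worth discussing in the report.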

Finding Your Data:

The usual way is to do a web search for data sets. There are a host of government websites full of tables of data that can be used for your project data set.

Some Agency Acronyms (a quick search will give the agency name and website)

CDC, DEA, DOD, DOE, DOL, DOT, FDA, FEMA, NCHS, NOAA, OSHA, NASA, NCER, NCIC, NCID, NCIRC, NCIS, USDA, USGS

You do not need to limit yourself to US agencies; you could also search other countries’ agencies.

Note: Some of the Data Sets you may find are actual population numbers, so calculating confidence intervals for them would not ordinarily make sense. (Having population data means you know the exact population mean and do not need a confidence interval.) For the purposes of this project you are to treat all data as SAMPLE data.
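
Treating the data as a sample is what gives the interval calculations meaning. A minimal Python sketch of the two one-sample intervals asked for in the project (one for the mean, one for a proportion) is shown here. The function names and the numbers are hypothetical, and a Normal (z) critical value is assumed; with smaller samples a t critical value for the mean is the usual choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """CI for the mean: x-bar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    m = mean(values)
    se = stdev(values) / sqrt(len(values))
    return m - z * se, m + z * se


def proportion_ci(successes, n, level=0.95):
    """CI for a proportion: p-hat +/- z * sqrt(p-hat * (1 - p-hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


# Hypothetical sample of 100 values, with 40 of them having some characteristic.
sample = list(range(1, 101))
mean_low, mean_high = mean_ci(sample)
prop_low, prop_high = proportion_ci(40, 100)
print(f"Mean CI:       ({mean_low:.3f}, {mean_high:.3f})")
print(f"Proportion CI: ({prop_low:.3f}, {prop_high:.3f})")
```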

Organizing Your Data:

I recommend that once you have decided on your data, you download or enter it into a spreadsheet program, probably Excel, and manipulate/organize it into a form you can paste into StatKey (or other program of your choice). Do all your subsets and ordering or filtering in the Spreadsheet program, then paste into the Statistics program you are using.
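
If your "other program of your choice" happens to be Python, the same subsetting can be done on a CSV file exported from the spreadsheet. The sketch below is only an illustration: the column names (year, region, value) and the numbers are invented, and the inline text stands in for a real exported file.

```python
import csv
import io
from statistics import mean

# Hypothetical stand-in for a CSV file exported from a spreadsheet.
csv_text = """year,region,value
2019,North,12.5
2019,South,9.8
2020,North,14.1
2020,South,10.4
2021,North,13.3
2021,South,11.0
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Segments based on a parameter other than time (here, region).
north = [float(r["value"]) for r in rows if r["region"] == "North"]
south = [float(r["value"]) for r in rows if r["region"] == "South"]

# A segment based on time (here, 2020 and later).
recent = [float(r["value"]) for r in rows if int(r["year"]) >= 2020]

print(f"North: n = {len(north)}, mean = {mean(north):.2f}")
print(f"South: n = {len(south)}, mean = {mean(south):.2f}")
print(f"2020 and later: n = {len(recent)}, mean = {mean(recent):.2f}")
```

To read a real file instead, open it with `open("your_file.csv", newline="")` and pass the file object to `csv.DictReader`; the file name here is a placeholder.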

MAT 120 Final Report – Grading Rubric April 26th.

Name:_______ ___________ Point Value

Category 4 3 2 1

Introduction (Organization)
4: The introduction is inviting, states the main topic and previews the structure of the paper.
3: The introduction clearly states the main topic and previews the structure of the paper, but is not particularly inviting.
2: The introduction states the main topic but does not adequately preview the structure of the paper, nor is it particularly inviting.
1: There is no clear introduction of the main topic or structure of the paper.

Mathematical Terminology and Notation (Counts 2x; that is, the score for this category is doubled)
4: Correct terminology and notation are always used, making it easy to understand what was done.
3: Correct terminology and notation are usually used, making it fairly easy to understand what was done.
2: Correct terminology and notation are used, but it is sometimes not easy to understand what was done.
1: There is little use, or a lot of inappropriate use, of terminology and notation.

Focus on Topic
4: There is one clear, well-focused topic. Main idea stands out and is supported by detailed information.
3: Main idea is clear but the supporting information is general.
2: Main idea is somewhat clear but there is a need for more supporting information.
1: The main idea is not clear. There is a seemingly random collection of information.

Completion of all Tasks (Counts 2x; that is, the score for this category is doubled)
4: All problems and Tasks are completed.
3: All but one or two of the problems and Tasks are completed.
2: All but four of the problems and Tasks are completed.
1: More than four of the problems and Tasks are not completed.

Mathematical Concepts
4: Explanation shows complete understanding of the mathematical concepts used to solve the problem(s).
3: Explanation shows substantial understanding of the mathematical concepts used to solve the problem(s).
2: Explanation shows some understanding of the mathematical concepts needed to solve the problem(s).
1: Explanation shows very limited understanding of the underlying concepts needed to solve the problem(s) OR is not written.

Neatness and Organization
4: The work is presented in a neat, clear, organized fashion that is easy to read.
3: The work is presented in a neat and organized fashion that is usually easy to read.
2: The work is presented in an organized fashion but may be hard to read at times.
1: The work appears sloppy and unorganized. It is hard to know what information goes together.

Mathematical Errors
4: 90-100% of the steps and solutions have no mathematical errors.
3: Almost all (85-89%) of the steps and solutions have no mathematical errors.
2: Most (75-84%) of the steps and solutions have no mathematical errors.
1: More than 75% of the steps and solutions have mathematical errors.

Diagrams and Sketches
4: Diagrams and/or sketches are clear and greatly add to the reader’s understanding of the procedure(s).
3: Diagrams and/or sketches are clear and easy to understand.
2: Diagrams and/or sketches are somewhat difficult to understand.
1: Diagrams and/or sketches are difficult to understand or are not used.

Support for Topic Completion
4: Relevant, telling, quality details give the reader important information that goes beyond the obvious or predictable. All problems are completed.
3: Supporting details and information are relevant, but one key issue or portion of the storyline is unsupported. All but one of the problems are completed.
2: Supporting details and information are relevant, but several key issues or portions of the storyline are unsupported. All but two of the problems are completed.
1: Supporting details and information are typically unclear or not related to the topic. Several of the problems are not completed.

