## Notes on Subject 1: Basic Statistical Principles

Science: Science is based on the empirical method for making observations, that is, for systematically obtaining information. It consists of methods for making observations.

Observations: Observations are the basic empirical "stuff" of science.

Statistics: Statistics is a set of methods and rules for organizing, summarizing and interpreting information. These methods and rules allow scientists to describe and analyze the observations they have made. Statistical methods are tools for science: science consists of methods for making observations; statistics consists of methods for describing and analyzing those observations.

Here are some of the "observations" we gathered in the survey we did on the first day of class in 1997 and 1998.

## Populations & Samples

Populations: A population is the set of all individuals of interest in a particular study. We will also refer to populations of scores.

Samples: A sample is a set of individuals selected from a population, usually intended to represent the population in a study. We will also refer to samples of scores. The data we gathered in class are a "sample" of scores obtained from a sample of individuals. The population we sampled from is the population of UNC undergraduates.

Parameters: A parameter is a value, usually a numerical value, that describes a population. A parameter may be obtained from a single measurement, or it may be derived from a set of measurements from the population.

Statistics: A statistic is a value, usually a numerical value, that describes a sample. A statistic may be obtained from a single measurement, or it may be derived from a set of measurements from the sample. Here are some "statistics" computed from our sample of data:

Data: Data (plural) are measurements or observations. A data set is a collection of measurements or observations. A datum (singular) is a single measurement or observation and is commonly called a data-value, a score, or a raw score.

Descriptive Statistics: Descriptive statistics are statistical procedures used to summarize, organize and simplify data. This is also the branch of statistical activity focusing on the use of such procedures. These procedures are the focus of chapters 1 through 5.

Statistical Visualization: Recently developed computational statistical procedures used to visually summarize, organize and simplify data. The statistical system we are using is named ViSta, for "Visual Statistics", because it has statistical visualization. A statistical visualization of our data shows the relationship between GPA and Satisfaction with the UNC experience: higher satisfaction is associated with higher GPA.

Exploratory Statistics: The process of exploring data by using descriptive and visualization methods to "see what the data seem to say". This is also the branch of statistics that focuses on "seeing what the data seem to say" (Tukey, 19??).

Inferential Statistics: Inferential statistics consist of techniques that allow us to study samples and then to make generalizations about the populations from which the samples were selected. This is also the branch of statistical activity focusing on the use of such procedures. These procedures are the focus of chapters 8 through the remainder of the text. The groundwork for statistical inference is laid in chapters 6 and 7.

Sampling Error: Sampling error is the discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter.
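The distinction between a parameter, a statistic, and sampling error can be made concrete with a small simulation. The following is a minimal Python sketch, not the class data: the population of 1,000 GPA-like scores, the sample size of 25, and the fixed seed are all assumptions chosen purely for illustration.

```python
import random
from statistics import mean

# Hypothetical population: 1,000 made-up GPA-like scores (NOT the class survey data).
rng = random.Random(0)  # fixed seed so the run is repeatable
population = [round(rng.uniform(2.0, 4.0), 2) for _ in range(1000)]

# A parameter is a value that describes the population.
population_mean = mean(population)

# A statistic is a value that describes a sample selected from that population.
sample = rng.sample(population, 25)
sample_mean = mean(sample)

# Sampling error: the discrepancy between the statistic and the parameter.
sampling_error = sample_mean - population_mean

print(f"parameter (population mean): {population_mean:.3f}")
print(f"statistic (sample mean):     {sample_mean:.3f}")
print(f"sampling error:              {sampling_error:+.3f}")
```

Drawing a different sample (for example, by changing the seed) would generally give a different statistic and therefore a different sampling error, while the parameter stays fixed.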

## The Scientific Method and the Design of Experiments

Science attempts to discover orderliness in the world, that is, to find regularity in changes. Something that can change is called a variable.

Variables: A variable is a characteristic or condition that changes or has different values for different individuals. In the data we gathered, the variables include "Gender", "Age", etc. A constant is a characteristic or condition that does not vary, and is the same for every individual.

The Correlational Method: The scientific method in which two (or more) variables are observed without manipulation (i.e., as they exist naturally) to see if there is any relationship between them. The correlational method cannot establish cause-and-effect: correlation is not causation! The data we gathered are an example of the correlational method. We can say that "Higher satisfaction is associated with higher GPA", but we can't say that "Higher GPA causes higher satisfaction" (or the converse).

The Experimental Method: The scientific method which can establish a cause-and-effect relationship between two (or more) variables. Some important points:

- The researcher manipulates one variable and observes what happens to the other.
- More than one variable may be manipulated or observed.
- To correctly establish cause-and-effect, the researcher must exercise some control over the experimental situation to ensure that some other variable(s) do(es) not influence the relationship being observed.
- Random assignment can be used to remove other variables' influence on the results.
- The experimental conditions must be identical, except for differing on values of the manipulated variable.

Independent Variable (also called the predictor variable): The variable which is manipulated by the researcher.

Dependent Variable (also called the response variable): The variable which is observed by the researcher for changes in order to assess the effect of the treatment. (The treatment is the manipulation of the predictor variable.)

Confounding Variable: An uncontrolled variable that is unintentionally allowed to vary systematically with the independent variable. It confounds the results (bad, bad, bad!).

The Control Group: This is a condition of the independent variable that does not receive the experimental treatment. Usually, the control group receives either no treatment or a placebo treatment.

The Experimental Group: This is a condition of the independent variable that does receive an experimental treatment. There may be several experimental groups.

The Quasi-Experimental Method: Examines differences between pre-existing groups of subjects (such as men vs. women) or differences between groups of scores obtained at different times (before and after treatment).

Hypotheses: A hypothesis is a prediction about the outcome of an experiment. In experimental research, a hypothesis makes a prediction about how the manipulation of the independent (predictor) variable will affect the dependent (response) variable.
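Random assignment, mentioned above as a way to keep other variables from influencing the results, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `randomly_assign`, the participant labels, and the two-group split are hypothetical and are not taken from the notes.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into a control and an experimental group.

    Hypothetical helper for illustration: every participant has the same chance
    of landing in either group, so pre-existing differences are not expected to
    vary systematically with the manipulated variable.
    """
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Twenty made-up participant labels: P01 ... P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, seed=42)

for name, members in groups.items():
    print(name, sorted(members))
```

With an odd number of participants the experimental group gets the extra person; that choice is arbitrary here.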

## Measurement

Data are measurements of observations which involve categorizing, ordering or using numbers to characterize amount. Several levels of measurement are involved. These in turn determine what statistics can be computed. Measurements may also be discrete or continuous.

### Scales (Levels) of Measurement

Nominal: The nominal level of measurement labels observations so that they fall into different categories. Football jersey numbers and home street addresses are common examples. In ViSta, nominal variables are called "Category" variables.

Ordinal: The ordinal level of measurement consists of categories that are ordered in a sequence. Order of finish in a race is a common example. In ViSta, ordinal variables are called "Ordinal" variables.

Interval: The interval level of measurement consists of ordered categories where all of the categories are intervals of exactly the same size. Temperature (in degrees Fahrenheit or Celsius) is a common example. Here, equal differences between numbers reflect equal differences in magnitude of the observed variable.

Ratio: The ratio level of measurement is an interval scale with an absolute zero point. Length and weight are common examples. Here, ratios of numbers reflect ratios of variable magnitude. In ViSta, interval and ratio variables are called "Numeric" variables.
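The difference between the interval and ratio levels can be checked numerically. The sketch below is an illustration with made-up values: on an interval scale such as temperature, equal differences stay equal when the unit changes, but ratios do not; on a ratio scale such as length, ratios survive a change of unit.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Interval scale: three made-up temperatures, 10 degrees Celsius apart.
temps_c = [0.0, 10.0, 20.0]
temps_f = [celsius_to_fahrenheit(c) for c in temps_c]  # 32.0, 50.0, 68.0

# Equal differences in Celsius remain equal differences in Fahrenheit.
diffs_c = [temps_c[1] - temps_c[0], temps_c[2] - temps_c[1]]  # 10.0 and 10.0
diffs_f = [temps_f[1] - temps_f[0], temps_f[2] - temps_f[1]]  # 18.0 and 18.0

# Ratios are NOT preserved, because neither zero point is an absolute zero.
ratio_c = temps_c[2] / temps_c[1]  # 20 / 10 = 2.0
ratio_f = temps_f[2] / temps_f[1]  # 68 / 50 = 1.36

# Ratio scale: made-up lengths in metres, converted to centimetres.
long_m, short_m = 2.0, 1.0
ratio_m = long_m / short_m
ratio_cm = (long_m * 100) / (short_m * 100)

print("Celsius differences:", diffs_c, "Fahrenheit differences:", diffs_f)
print("temperature ratios:", ratio_c, "vs", ratio_f)
print("length ratios:", ratio_m, "vs", ratio_cm)
```

Saying that 20 degrees is "twice as hot" as 10 degrees therefore depends on the unit, whereas saying that 2 metres is twice as long as 1 metre does not.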

### Discrete and Continuous Variables

Discrete: A discrete variable has separate, indivisible categories. No values can exist between two neighboring categories.

Continuous: A continuous variable has an infinite number of possible values falling between any two observed values.

## Mathematical Notation

In statistical calculations you will constantly be required to add a set of values to find a specific total. We use algebraic expressions to represent the values being added. For example, X means "scores on a variable": X = <1 2 3> refers to a variable with three observations, which are 1, 2, and 3. We will use the Greek letter sigma ($\Sigma$) to denote the summation process. Thus, we write $\sum X$ for the sum of the scores on X.

Note the order of operations:

1. All calculations within parentheses are done first.
2. Squaring, multiplying, and dividing are done second, and should be completed in order from left to right.
3. Adding and subtracting (including summation) are done third, and should be completed in order from left to right.

The following term, which is called the "squared sum", works as shown for X = <1 2 3> (the scores are summed first, because of the parentheses, and the total is then squared):

$$\left(\sum X\right)^2 = (1 + 2 + 3)^2 = 6^2 = 36$$

Because of the order of operations, the following term, which is called "the sum of squares", works as shown (each score is squared first, and the squares are then summed):

$$\sum X^2 = 1^2 + 2^2 + 3^2 = 1 + 4 + 9 = 14$$

Consider how the following summation equation works:

On the other hand, the next summation equation works differently:

Finally, consider how this last summation equation works:
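The order-of-operations rules for summation can also be checked numerically. Here is a minimal Python sketch using the X = <1 2 3> example from these notes. The last two expressions, $\sum (X + 1)$ and $\sum X + 1$, are illustrative assumptions chosen to show why parentheses matter; they are not necessarily the equations the original notes displayed.

```python
# Scores on a variable X, taken from the example in the notes.
X = [1, 2, 3]

# Sum of the scores: 1 + 2 + 3
sum_x = sum(X)

# "Squared sum": parentheses first, so sum the scores, then square the total.
squared_sum = sum(X) ** 2

# "Sum of squares": squaring comes before summation, so square each score first.
sum_of_squares = sum(x ** 2 for x in X)

# Illustrative assumption: add 1 to every score inside the parentheses, then sum.
sum_of_shifted = sum(x + 1 for x in X)

# Illustrative assumption: sum the scores first, then add 1 once to the total.
sum_then_shift = sum(X) + 1

print("sum of X:         ", sum_x)           # 6
print("squared sum:      ", squared_sum)     # 36
print("sum of squares:   ", sum_of_squares)  # 14
print("sum of (X + 1):   ", sum_of_shifted)  # 9
print("(sum of X) + 1:   ", sum_then_shift)  # 7
```

The squared sum (36) and the sum of squares (14) differ only in whether the squaring happens before or after the summation, which is exactly what the three rules determine.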