Statistics Coursework Essay, Research Paper

Investigation into some of the statistical differences between The Times and The Telegraph on a specific day

Design and Planning

The purpose of this project is to compare two daily newspapers. The two papers that will be used are THE TIMES and THE TELEGRAPH, both purchased on the same day. A lot of data can be easily collected from a newspaper, ranging from mean word length to the area devoted to adverts per page.

The project will attempt to draw conclusions regarding three specific questions. In answering these questions a range of sampling methods, presentations of data, and statistical calculations will be used in order to interpret and evaluate the data and come to a valid conclusion, drawing together all the data.

Each question will be presented, and the statistical methods involved in drawing conclusions for it will be explained.

Question 1:

+ How does the font size of the headline text affect the length of the article?

This involves comparing two sets of data:

+ Font size of headline text: A sheet was printed from Microsoft Word showing various font sizes in the Times New Roman font, the standard font for the two papers. This was used as a guide when collecting all the data.

+ Length of column of each article: In The Times and The Telegraph there is a standard column width, and simply measuring the vertical length of all the columns in the article gives a suitably accurate indication of the length of the article.

To make any calculations accurate enough to draw a valid conclusion, at least 20 sets of data from each paper will need to be collected. As each page has about three articles on it and both newspapers have approximately 30 pages, a systematic sample of every 4 pages will provide enough data to support any conclusion.
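The systematic sample described above can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample` and the choice of page 1 as the starting page are assumptions, not part of the original plan.

```python
def systematic_sample(n_pages, step, start=1):
    """Return every `step`-th page number from `start` up to `n_pages`."""
    return list(range(start, n_pages + 1, step))


# Roughly 30 pages per paper, sampling every 4th page (start page assumed to be 1).
pages = systematic_sample(n_pages=30, step=4)

# About three articles per page, so this is the expected number of articles.
expected_articles = len(pages) * 3

print(pages)              # [1, 5, 9, 13, 17, 21, 25, 29]
print(expected_articles)  # 24
```

With eight sampled pages and about three articles per page, the plan yields roughly 24 articles per paper, which is just above the target of 20.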

The best way to find out whether the size of the headline text affects the length of the article is to draw a scatter diagram, find the line of best fit, and use Spearman's rank correlation coefficient.

Question 2:

+ What is the most common type of advertisement and how much space is given to each? Compare how this differs in the two newspapers.

This involves collecting two sets of data:

+ Number of times a pre-defined type of advert occurs: This will be done simply by looking through the paper and making a tally chart.

+ Area devoted to each pre-defined advert type: Whilst making the tally chart, the area of each advert will also be recorded in square centimetres. All these results can then be added up to give the total area devoted to each advert type.

To make any calculations accurate enough to draw a valid conclusion, at least 20 sets of data from each paper will need to be collected. The only fair way to do this is to collect data from the whole of both papers, as this gives a much better picture of how much advert space there is and will provide at least 20 sets of data from each paper.

The best way to compare the data collected is to draw two sets of comparative pie charts: one set comparing the type of advert and the other comparing the area devoted to each type.

Question 3:

+ What are the dispersion and averages of the number of words in each article, and how do they differ between the two newspapers?

This involves collecting one set of data:

+ The number of words: This will be done by counting the number of words in the first sentence, as this usually gives a good indication of the depth of the article. The data will be collected in a grouped frequency table.

To make any calculations accurate enough to draw a valid conclusion, at least 20 sets of data from each paper will need to be collected. Therefore, to collect the right amount of data, 50 samples in total over the two papers should be taken as a stratified random sample, distributing the number of samples proportionately between the two papers. A page number should then be randomly generated and the first article from that page sampled.
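A minimal sketch of the stratified random sample described above, assuming each paper has exactly 30 pages (the text only says "approximately 30"). The function names and the fixed random seed are illustrative assumptions, and pages are drawn without repetition so that no article is sampled twice.

```python
import random


def proportional_allocation(total_samples, page_counts):
    """Split the samples between the papers in proportion to their page counts."""
    total_pages = sum(page_counts.values())
    return {
        name: round(total_samples * pages / total_pages)
        for name, pages in page_counts.items()
    }


def pick_pages(n_samples, n_pages, rng):
    """Choose n_samples distinct page numbers at random from 1..n_pages."""
    return rng.sample(range(1, n_pages + 1), n_samples)


# Assumed page counts; the real papers may differ slightly.
page_counts = {"The Times": 30, "The Telegraph": 30}

allocation = proportional_allocation(50, page_counts)
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
pages = {
    name: pick_pages(count, page_counts[name], rng)
    for name, count in allocation.items()
}

print(allocation)  # {'The Times': 25, 'The Telegraph': 25}
```

With unequal page counts the rounded allocations may not add up to exactly 50, so the totals would need checking by hand in that case.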

The best way to compare these two sets of data will be to use standard deviation, mean deviation, the interquartile ranges, the averages (mean, median, mode), and histograms with box and whisker diagrams.
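As a rough illustration of the measures listed above, the sketch below computes them with Python's standard `statistics` module. The word counts are hypothetical and are not data from either newspaper. Quartile conventions vary between textbooks, so the interquartile range from `statistics.quantiles` may differ slightly from a hand calculation.

```python
import statistics


def summarise(values):
    """Return the averages and measures of dispersion for one data set."""
    mean = statistics.mean(values)
    q1, _, q3 = statistics.quantiles(values, n=4)  # lower quartile, median, upper quartile
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "standard deviation": statistics.pstdev(values),
        "mean deviation": sum(abs(v - mean) for v in values) / len(values),
        "interquartile range": q3 - q1,
    }


# Hypothetical word counts, for illustration only.
word_counts = [2, 4, 4, 4, 5, 5, 7, 9]
summary = summarise(word_counts)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Running `summarise` once per newspaper would give two sets of figures that can be compared side by side.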

Collection, selection, presentation, analysis, interpretation and evaluation of data

Question 1:

To make the calculations accurate enough to draw a valid conclusion, 20 sets of data from each paper were collected. As each page has about three articles on it and both newspapers have approximately 30 pages, a systematic sample of every 4 pages was used to provide enough data to support any conclusion. As with all continuous data, the column length has a maximum and minimum error, which means that errors in the data are possible; however, these errors will not noticeably affect any of the statistical calculations.

THE TIMES                          THE TELEGRAPH
Font Size   Column Length (cm)     Font Size   Column Length (cm)
72          57                     90          45
36          11                     48          20
24          16                     36          17
48          59                     48          20
72          68                     72          46
36          20                     20          5
28          20                     36          30
72          34                     72          75
36          14                     30          24
80          36                     28          16
72          49                     48          42
36          21                     24          6
28          20                     36          22
48          35                     72          38
36          18                     28          14
90          83                     72          34
24          8                      90          80
60          46                     28          19
36          18                     90          104
72          34                     72          67

It was found that not every page had 3 articles on it, so not as many samples were collected as was hoped, but fortunately 20 samples were still collected from each paper.

Scatter diagrams

First a scatter diagram was drawn for each of the two sets of data. This consists of laying out each of the measures along one of the axes of the grid, then considering each item in turn. The two measures for that item act exactly like an ordered pair and thus like the co-ordinates of a point on the grid. Each item considered is thereby linked to one point on the grid, and that point can be plotted in the normal way. From the scatter of points that is built up a pattern can be identified; a line of best fit simplifies this trend. To plot this line a special point was plotted: (mean font size, mean column length). These were then compared.
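The special point through which each line of best fit passes can be computed directly from the data table above. The sketch below is illustrative only; the function name `mean_point` is an assumption, and the pairs are (font size, column length in cm) as recorded for each paper.

```python
def mean_point(pairs):
    """Return (mean of x values, mean of y values) for a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    return sum(xs) / len(xs), sum(ys) / len(ys)


# (font size, column length in cm), copied from the data table.
times = [
    (72, 57), (36, 11), (24, 16), (48, 59), (72, 68),
    (36, 20), (28, 20), (72, 34), (36, 14), (80, 36),
    (72, 49), (36, 21), (28, 20), (48, 35), (36, 18),
    (90, 83), (24, 8), (60, 46), (36, 18), (72, 34),
]
telegraph = [
    (90, 45), (48, 20), (36, 17), (48, 20), (72, 46),
    (20, 5), (36, 30), (72, 75), (30, 24), (28, 16),
    (48, 42), (24, 6), (36, 22), (72, 38), (28, 14),
    (72, 34), (90, 80), (28, 19), (90, 104), (72, 67),
]

print(mean_point(times))      # (50.3, 33.35)
print(mean_point(telegraph))  # (52.0, 36.2)
```

A line of best fit drawn by eye should pass through this point, with roughly equal numbers of plotted points on either side of the line.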

Spearman's Rank Correlation Coefficient

Another method of finding the relationship between two sets of data is to use Spearman's Rank Correlation Coefficient. Each distribution must first be put into an order of merit. Each item being considered has two ranks allocated to it, and the difference between these two ranks can be found. If the symbol d is used to represent this difference, then the coefficient of rank correlation r can be written as:

$$ r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$

where n is the number of items in the distribution.

If two or more measures in one distribution are equal it is convenient, though not mathematically justifiable, to allocate them a rank which is the mean of the ranks they would have occupied if they had been different. For example, if the 3rd and 4th measures in a distribution are equal they would both be allocated the rank 3.5, or if the 5th, 6th, and 7th are equal they would all be allocated the rank 6.
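The ranking rule for tied measures and the rank correlation coefficient can be sketched in Python as follows. This is a minimal illustration on small hypothetical data, not the working used in the project; the function names are assumptions, and the coefficient uses the standard formula 1 - 6Σd² / (n(n² - 1)).

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of the positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rank correlation coefficient from the sum of squared rank differences."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical values: the 3rd and 4th largest are equal, so both get rank 3.5.
print(average_ranks([90, 80, 70, 70, 60]))  # [1.0, 2.0, 3.5, 3.5, 5.0]
print(spearman([1, 2, 3], [3, 2, 1]))       # -1.0
```

As the text notes, sharing the mean rank among ties is a convenience, so the formula is only approximate when many ties are present.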

The easiest way to represent this data and to calculate the correlation is to put the data in tabular form.

The Times

Font Size   Column Length (cm)   Rank (Font Size)   Rank (Column Length)   d       d²
72          57                   5                  4                      1       1
36          11                   13.5               19                     -5.5    30.25
24          16                   19.5               17                     2.5     6.25
48          59                   9.5                3                      -7.5    56.25
72          68                   5                  2                      3       9
36          20                   13.5               13                     0.5     0.25
28          20                   17.5               13                     4.5     20.25
72          34                   5                  9.5                    -4.5    20.25
36          14                   13.5               18                     -4.5    20.25
80          36                   2                  7                      -5      25
72          49                   5                  5                      0       0
36          21                   13.5               11                     2.5     6.25
28          20                   17.5               13                     4.5     20.25
48          35                   9.5                8                      1.5     2.25
36          18                   13.5               15.5                   -2      4
90          83                   1                  1                      0       0
24          8                    19.5               20                     -0.5    0.25
60          46                   8                  6                      2       4
36          18                   13.5               15.5                   -2      4
72          34                   5                  9.5                    -4.5    20.25

n = 20      Σd² = 250

Applying the above equation to the values in the table gives a measure of the relationship between the two distributions:

$$ r = 1 - \frac{6 \times 250}{20(20^2 - 1)} = 1 - \frac{1500}{7980} \approx 0.81 $$

The Telegraph

Font Size   Column Length (cm)   Rank (Font Size)   Rank (Column Length)   d       d²
90          45                   2                  6                      -4      16
48          20                   9                  13.5                   -4.5    20.25
36          17                   13                 16                     -3      9
48          20                   9                  13.5                   -4.5    20.25
72          46                   6                  5                      1       1
20          5                    20                 20                     0       0
36          30                   13                 10                     3       9
72          75                   6                  3                      3       9
30          24                   15                 11                     4       16
28          16                   17                 17                     0       0
48          42                   9                  7                      2       4
24          6                    19                 19                     0       0
36          22                   13                 12                     1       1
72          38                   6                  8                      -2      4
28          14                   17                 18                     -1      1
72          34                   6                  9                      -3      9
90          80                   2                  2                      0       0
28          19                   17                 15                     2       4
90          104                  2                  1                      1       1
72          67                   6                  4                      2       4

n = 20      Σd² = 128.5

Applying the above equation to the values in the table gives a measure of the relationship between the two distributions:

$$ r = 1 - \frac{6 \times 128.5}{20(20^2 - 1)} = 1 - \frac{771}{7980} \approx 0.90 $$

Interpretation and evaluation:

Scatter diagrams:

On the scatter diagrams a line of best fit was drawn, passing through the mean point of each data set.

The distribution of the points plotted on the scatter diagrams can give an indication of the relation between the two characteristics being measured. The lines of best fit are both straight, which shows that the measures are directly proportional. The lines on both diagrams have similar angles, about 45°, which shows that the relationship between headline size and article length is very strong in both newspapers. Despite the good lines of best fit, on both diagrams the points deviate further from the line as the headline size and article length increase. This may show that, although the article length does increase as the headline size increases, the relationship between the two measures is weaker at higher values than it is when both are small. This means that as one value increases so does the other, but it may increase by more or less than in proportion to its original size.

Spearman's Rank Correlation Coefficient:

Despite knowing that both diagrams reveal a strong correlation, there is no easy way of knowing which one has the stronger correlation, nor do the diagrams provide a measure of how closely the points approximate to the lines of best fit. The type of measure that is used for this purpose is called the coefficient of correlation, and it is assessed on a scale which runs from +1 through zero to -1. A coefficient of correlation of +1 means that the two distributions match each other perfectly, and this would correspond to a scatter diagram where all of the points plotted lie along the leading diagonal of the grid. A coefficient of correlation of -1 would correspond to a pair of distributions where the measures are in completely the opposite order, that is, the first in one distribution is last in the other and so on.

As with the scatter diagrams, Spearman's rank shows that both newspapers have a strong relationship between headline size and article length. But it reveals that this relationship is stronger in The Telegraph than it is in The Times. However, Spearman's rank can be misleading, as it only considers the ranks of the distributions and not the actual values, as the scatter diagram does.

To conclude, both measures show that there is a strong relationship between the size of the font of the headline text and the length of the article. This makes logical sense and would reasonably be expected in both newspapers. However, there is no particular reason why The Telegraph should have a stronger relationship than The Times, and this may simply be what the papers were like on that specific day.

Question 2:

To make any calculations accurate enough to draw a valid conclusion, at least 20 sets of data from each paper were needed. The only fair way to do this was to collect data from the whole of both papers, as this gives a much better picture of how much advert space there is and provides at least 20 sets of data from each paper.

The Times                            The Telegraph
Advert Type             Area (cm²)   Advert Type             Area (cm²)
Holiday                 442          Holiday                 170
Computer                468          Holiday                 400
Alcohol                 493          Holiday                 425
Car                     2088         Telephone               672
Computer                775          Bank/Insurance/Money    250
Bank/Insurance/Money    408          Holiday                 250
Holiday                 12           Computer                858
Bank/Insurance/Money    170          Car                     2088
Car                     988          Bank/Insurance/Money    1015
Holiday                 544          Electrical Appliances   2088
Fashion                 918          Car                     950
Electrical Appliances   825          Education               160
Telephone               116          Furniture               2088
Car                     2088         Computer                2052
Electrical Appliances   825          Car                     540
Car                     900          Holiday                 168
Computer                2088         Computer                832
Car                     412.5        Holiday                 544
Car                     928          Car                     2088
Books                   400          Bank/Insurance/Money    450
Computer                400          Bank/Insurance/Money    450
Holiday                 425          Computer                881
Cinema                  912.5        Education               425
Bank/Insurance/Money    280          Telephone               180
                                     Car                     240
                                     Computer                693
                                     Bank/Insurance/Money    476
                                     Bank/Insurance/Money    425

Both newspapers had more than enough adverts within them to support any valid conclusions. Although The Telegraph has four more adverts in it than The Times, this will not affect any statistical calculations.

First two tables were drawn up, one to show the frequency of each type of advert and the other to show the area devoted to each specific type of advert. From these two tables two sets of comparative pie charts were drawn: one comparing the type of adverts in The Times and The Telegraph, and the other comparing the area devoted to each of these types of advert in The Times and The Telegraph.
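The step from the raw advert list to the two summary tables can be sketched in Python. The list below is The Telegraph's adverts as recorded above, as (type, area in cm²) pairs. Treating "Vacation" as "Holiday" and "Phone" as "Telephone" is an assumption about the original labels, and the variable names are illustrative.

```python
from collections import Counter, defaultdict

# The Telegraph's adverts as (type, area in cm²), copied from the raw data.
telegraph_adverts = [
    ("Holiday", 170), ("Holiday", 400), ("Holiday", 425), ("Telephone", 672),
    ("Bank/Insurance/Money", 250), ("Holiday", 250), ("Computer", 858),
    ("Car", 2088), ("Bank/Insurance/Money", 1015),
    ("Electrical Appliances", 2088), ("Car", 950), ("Education", 160),
    ("Furniture", 2088), ("Computer", 2052), ("Car", 540), ("Holiday", 168),
    ("Computer", 832), ("Holiday", 544), ("Car", 2088),
    ("Bank/Insurance/Money", 450), ("Bank/Insurance/Money", 450),
    ("Computer", 881), ("Education", 425), ("Telephone", 180), ("Car", 240),
    ("Computer", 693), ("Bank/Insurance/Money", 476),
    ("Bank/Insurance/Money", 425),
]

# Tally chart: how many adverts of each type.
frequency = Counter(kind for kind, _ in telegraph_adverts)

# Total area devoted to each type.
area_by_type = defaultdict(float)
for kind, area in telegraph_adverts:
    area_by_type[kind] += area

print(frequency["Holiday"])        # 6
print(area_by_type["Car"])         # 5906.0
print(sum(area_by_type.values()))  # 21858.0
```

The same two passes over The Times's list would produce its column in each summary table.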

Comparative pie charts allow you to compare not only the percentage components but also the totals of the components; to do this, the areas of the pie charts must be proportional to the totals of the components.

Type of Advert

Advert Type             Frequency (Times)   Frequency (Telegraph)
Holiday                 4                   6
Computer                4                   5
Car                     6                   5
Bank/Insurance/Money    3                   6
Telephone               1                   2
Alcohol                 1                   0
Fashion                 1                   0
Electrical Appliances   2                   1
Book                    1                   0
Cinema                  1                   0
Education               0                   2
Furniture               0                   1

Total                   24                  28

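Letting r_1 and r_2 be the radii of the pie charts representing The Times and The Telegraph respectively, and N_1 = 24 and N_2 = 28 the total numbers of adverts, then if r_1 equals 4 cm:

$$ r_2 = r_1 \sqrt{\frac{N_2}{N_1}} = 4 \sqrt{\frac{28}{24}} \approx 4.32 \text{ cm} $$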

The angle in the pie chart that represents each type of advert can be calculated by:

Dividing the frequency n of that type by the total N and multiplying by 360.

e.g. Holiday in The Times: (4 / 24) × 360 = 60°
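The radius and angle calculations for comparative pie charts can be sketched as below. The function names are illustrative assumptions, and the sketch assumes The Times chart is the one drawn with a 4 cm radius. Because the chart areas must be proportional to the totals, the radii scale with the square root of the ratio of the totals.

```python
import math


def scaled_radius(r1, total1, total2):
    """Radius of the second chart so that chart areas are proportional to the totals."""
    return r1 * math.sqrt(total2 / total1)


def sector_angle(part, total):
    """Angle in degrees of the sector representing `part` out of `total`."""
    return part / total * 360


# Frequency charts: 24 adverts in The Times, 28 in The Telegraph.
print(round(scaled_radius(4, 24, 28), 2))  # 4.32
print(round(sector_angle(4, 24), 1))       # 60.0  (Holiday in The Times)
```

The same two functions apply unchanged to the area charts, using total areas in place of total frequencies.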

Area devoted to each advert type

Advert Type             Area (cm²), Times   Area (cm²), Telegraph
Holiday                 1423                1957
Computer                3731                5316
Car                     6476.5              5906
Bank/Insurance/Money    858                 3066
Telephone               116                 852
Alcohol                 493                 0
Fashion                 918                 0
Electrical Appliances   1650                2088
Book                    400                 0
Cinema                  912.5               0
Education               0                   585
Furniture               0                   2088

Total                   16978               21858

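Letting r_1 and r_2 be the radii of the pie charts representing The Times and The Telegraph respectively, and A_1 = 16978 and A_2 = 21858 the total advert areas in cm², then if r_1 equals 4 cm:

$$ r_2 = r_1 \sqrt{\frac{A_2}{A_1}} = 4 \sqrt{\frac{21858}{16978}} \approx 4.54 \text{ cm} $$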

The angle in the pie chart that represents each type of advert can be calculated by:

Dividing the area of that type by the total area and multiplying by 360.

e.g. Car in The Telegraph: (5906 / 21858) × 360 ≈ 97.3°

Interpretation and evaluation:

Type

It can clearly be seen from the data that The Telegraph has four more adverts than The Times. In The Times the adverts for cars are the most frequent, whereas in The Telegraph adverts for holidays and for bank/insurance/money are the most common. Holiday, computer, car and bank/insurance/money adverts are the four most common types of advert in both papers, with the other categories occurring only once or twice. This could be expected, considering the type of newspapers that are being sampled. The Times and The Telegraph have a certain type of reader, and these adverts are evidently aimed specifically at these readers. Also, the four most common adverts are advertising products and services that involve the largest amounts of money, so it is plausible that it is more profitable for the paper to carry these types of advert, as competition will raise the price of advertising.

What the comparative pie charts allow you to do is compare the percentage of the total adverts that each advert type represents. The charts show that if a certain advert type has an equal frequency in The Times and The Telegraph, it has a higher percentage of the total in The Times than in The Telegraph. This is shown clearly by the fact that the car adverts in The Times take up a higher percentage of the total adverts than the holiday and bank/insurance/money adverts do in The Telegraph, despite them having the same frequency. It is also worth noting that The Times has a wider range of adverts than The Telegraph. In both cases the four most common adverts take up about three quarters of the chart, which again shows the readers the papers are aimed at and that the size of the market for these adverts is larger than for the rest.

The less frequent adverts can be affected by the contents of the newspapers on that day, which may explain why adverts in one paper do not occur in the other. It is also possible that these adverts do not have such a large market among the readers, or that large amounts of advertising are not economically viable.

Area

Looking at the comparative pie charts for the area of the adverts of each type reveals that although certain advert types occur frequently, they do not necessarily cover a large area. This is shown by the holiday and electrical appliances categories in The Telegraph: holiday represents 21.4% of the adverts by type whilst covering only 9% of the total area devoted to adverts, whereas electrical appliances represents only 3.6% of the adverts by type whilst covering 9.6% of the total area devoted to adverts. This may be because certain types of advert do not occur frequently but need more space, while some frequent adverts do not need a lot of space. The car category covers the largest area in both The Times