SCIENCE!!! (Read 126952 times)

Oldmanmatt · April 14, 2014, 09:06:34 am

No, splitting is not new. We have been generating O2 for use on Submarines, using electrolysis, for decades. This talks of a Catalytic reaction and the generation of Hydrocarbons rather than an energy intense Electrolysis. So not a green fuel like pure H (though surely less S, so reduced SOX if not NOX)?
Still waiting for a paper to read, if you stumble across one. This was thrown up by an old friend as the first hint of something concrete about a rumour which has been kicking around the Maritime world for a couple months.

One of the limiting factors (as it is now with Electrolysis and even just steam production) will be removal of impurities (salts) from the sea water. At present done by Evaporation or Reverse Osmosis, in the main and both energy intensive.
Given the efficiency mentioned in the article, the water must be purified prior to reaction (?).

slackline · April 14, 2014, 09:54:13 am

Quote from: Yoof on April 13, 2014, 10:21:52 pm

More stats love (or disappointment)

http://www.nature.com/news/weak-statistical-standards-implicated-in-scientific-irreproducibility-1.14131

I read Johnoston's paper that this is highlighting the other month when it came out. The main problem for me is that its based on something called Bayes Factor (and he back-translates the significance threshold to frequentist p-values) which traditionally are used to for comparing different models rather than hypothesis testing in and of itself. If you're applying it to hypothesis testing then it essentially reduces to the ratio of the p-value under the Null Hypothesis (H₀) to the p-value under the Alternative Hypothesis (H₁). Quite how you derive the p-values under the alternative hypothesis is unclear to me, and is just as baffling as to how people actually choose meaningful prior probabilities for Bayesian analysis in the first place.

What is a big problem is the mistaken belief that anything with a p < 0.05 is "statistically significant".

This stems from some comments made by the eminent statistican (and geneticist) R.A. Fisher in a 1926 research paper and then again in his popular book Statistlca Methods for Research Workers....

From RA Fisher The Arrangement of Field Experiments The Journal of the Ministry of Agriculture (1926) 33:504...

Quote

If one in twenty does not seem high enough odds, we may, if we prefer it, draw the line at one in fifty (the 2 per cent. point), or one in a hundred (the 1 per cent. point). Personally, the writer prefers to set a low standard of significance at the 5 per cent. point, and ignore entirely all results which fail to reach this level. A scientific fact should be regarded as experimentally established only if a properly designed experiment \textbf{rarely} fails to give this level of significance.

From RA Fisher The Design of Experiments (1935) p 13...

Quote

It is usual and convenient for experimenters to take 5 per cent. as a standard level of significance, in the sense that they are prepared to ignore all results which fail to reach this standard, and, by this means, to eliminate from further discussion the greater part of the fluctuations which chance causes have introduced into their experimental results.

And for whatever reason (perhaps simplicity and not having to actually think about the results and the body of evidence) this has become the de-facto threshold for declaring "statistical significance" in many areas of scientific research. But this was never how Fisher meant for p-values to be used, they were only ever one part of the body of evidence to support a hypothesis because a few pages further into The Design of Experiments (p16) he wrote...

Quote

No isolated experiment, however significant in itself, can suffice for the experimental demonstration of any natural phenomenon; for the 'one chance in a million' will undoubtedly occur, with no less and no more than its appropriate frequency, however surprised we may be that it should occur to us.

But that seems to often get ignored as many people don't have statistical training, but instead prefer nice clear simple rules so that they don't have to think too much about it (viz. a threshold for stating someone has "high" blood pressure and should be on drugs to control them). But you can't ever escape having to look at the evidence yourself and think about what the data from all areas are showing you..

From RA Fisher The Design of Experiments (1935)p2

Quote

The statistician cannot excuse himself from the duty of getting his head clear on the principles of scientific inference, but equally no other thinking man can avoid a like obligation.

From RA Fisher Statistical methods and scientific induction. (1955) J Roy Stat Soc B 17:69-78

Quote

We have the duty of formulating, of summarising, and of communicating our conclusions, in intelligible form, in recognition of the right of other free minds to utilize them in making their own decisions.

Given a large enough sample size any small difference can be statistically significant, but is it an actual (clinically) meaningful difference. For example you might be able to demonstrate that a drug can lower blood pressure by 0.1mmHg, but thats not really a clinically relevant difference.

Some more reading on the area if anyone is interested is this easy read (also in Nature, although not sure if its pay-walled) Scientific method: Statistical errors and also the articles discussing the science wide false-discovery rate linked from this blog

SA Chris · April 14, 2014, 10:22:12 am

Bouldering is good for your bones

Obi-Wan is lost... · April 14, 2014, 10:41:35 am

Quote from: SA Chris on April 14, 2014, 10:22:12 am

Bouldering is good for your bones

Was speaking recently to someone who worked in that area of medicine and they were saying they loved the current popularity of ~~trampolines~~ bounce-areens*, as you'll get a generation of people growing up with strong bones.

*Mini-Obi No.2 calls trampolines, bounce-areens and it's such a more suitable word we've adopted it.

SA Chris · April 14, 2014, 11:00:46 am

Not great for under 5s though!

http://www.bbc.co.uk/news/health-19713691

more than 1 child on a trampoline = tears at some point.

erm, sam · April 14, 2014, 11:44:27 am

Quote

Some injuries may even be fatal - failed attempts at somersaults and flips frequently cause cervical spine injuries, resulting in permanent and devastating consequences, says the AAP.

Frequently my arse. Very very occasionlly would be a more accurate description I am sure.

slackline · April 14, 2014, 12:09:57 pm

Quote from: erm, sam on April 14, 2014, 11:44:27 am

Quote
Some injuries may even be fatal - failed attempts at somersaults and flips frequently cause cervical spine injuries, resulting in permanent and devastating consequences, says the AAP.

Frequently my arse. Very very occasionlly would be a more accurate description I am sure.

I disagree I expect that failed attempts do often result in injury particularly on home trampolines where there is no coaching as to how to perform them, nor crash mat being pushed in to cushion the landing when learning*, but there are many who successfully execute the somersault/flip.

* I used to do trampolining from about 14 through to 17, our school team even won the national schools championships (not that impressive really). I did have to spot someone when in a competition once where they were starting their routine with a triple somersault with two twists on exit, before starting he told me to get out of the way if he was coming off as there was little I'd be able to do to stop him. I've also seen people take falls onto the hard concrete floor of sports halls, not pretty (fractured hip and broken forearm).

Yoof · April 14, 2014, 12:14:39 pm

Quote from: slackline on April 14, 2014, 09:54:13 am

Quote from: Yoof on April 13, 2014, 10:21:52 pm
More stats love (or disappointment)

http://www.nature.com/news/weak-statistical-standards-implicated-in-scientific-irreproducibility-1.14131

I read Johnoston's paper that this is highlighting the other month when it came out. The main problem for me is that its based on something called Bayes Factor (and he back-translates the significance threshold to frequentist p-values) which traditionally are used to for comparing different models rather than hypothesis testing in and of itself. If you're applying it to hypothesis testing then it essentially reduces to the ratio of the p-value under the Null Hypothesis (H₀) to the p-value under the Alternative Hypothesis (H₁). Quite how you derive the p-values under the alternative hypothesis is unclear to me, and is just as baffling as to how people actually choose meaningful prior probabilities for Bayesian analysis in the first place.

etc.

I'm just getting into (bio)statistics, and currently have a bit of a thing for R, so I'm pretty curious as to whether or not the data I'm testing actually mean anything, and if the alpha we choose means literally nothing it's a bit disappointing. What the two nature papers seem to be suggesting to me is that we should either use more stringent p values for testing hypotheses, or that we should shift towards very carefully applied Bayesian. I'm pretty excited to test out the Bayesian First Aid package at some point to see how the techniques compare.

A quote from one of my lecturers

"I once met someone who thought they had seen some Bayesian statistics. Or it could have been a dunnock"

What do you reckon to Bayesian stats?

SA Chris · April 14, 2014, 12:19:58 pm

I tried to do a front flip on a trampoline in my teenage years on a trampoline without padding and smacked both my heels on the steel frame, my ankles swelled up so badly I couldn't walk for 3 days.

slackline · April 14, 2014, 12:54:09 pm

Quote from: Yoof on April 14, 2014, 12:14:39 pm

I'm just getting into (bio)statistics, and currently have a bit of a thing for R

R is brilliant, get to know Hadley Wickham's tools (dplyr, reshape2, ggplot2 etc.) to ease the learning curve and make life a lot simpler.

Quote from: Yoof on April 14, 2014, 12:14:39 pm

I'm pretty curious as to whether or not the data I'm testing actually mean anything, and if the alpha we choose means literally nothing it's a bit disappointing.

P-values are not meaingless, they tell you the probability of obtaining the observed results if the null hypothesis is true.

Quote from: Yoof on April 14, 2014, 12:14:39 pm

What the two nature papers seem to be suggesting to me is that we should either use more stringent p values for testing hypotheses, or that we should shift towards very carefully applied Bayesian.

What the articles are saying is that p-values are misused by researchers. There's a big difference between that and saying that p-values are meaningless.

Quote from: Yoof on April 14, 2014, 12:14:39 pm

I'm pretty excited to test out the Bayesian First Aid package at some point to see how the techniques compare.

Not heard of the Bayesian First Aid package yet (although you can no doubt install it using Wickham's devtools which make installing R packages hosted on Github a piece of piss). It sounds as though they are a way of calling Bayesian analysis routines that already exist using the equations/structure/language of regression modelling. No doubt a luadable goal and one that would be used by many, but I'm always wary of making statistical analysis "easy" because it just leads to people using it without understanding what it is that they are doing (are the assumptions of the tests satisfied?) and misinterpreting the results.

Quote from: Yoof on April 14, 2014, 12:14:39 pm

A quote from one of my lecturers

"I once met someone who thought they had seen some Bayesian statistics. Or it could have been a dunnock"

What do you reckon to Bayesian stats?

They're one of many analytical tools that has their place in the arsenal of a statistician. I don't like the Frequentist v's Bayesian "debate" as its not a black and white situation. If you want to get more involved in using it consider looking up JAGS and/or OpenBUGS in R.

A good example of modelling gone wrong is Google Flu Trends which used a "BIG DATA" approach to predicting flu out-breaks based on people's search terms and was initially very successful, however in subsequent years it failed because its based purely on patterns in data rather than what is driving the patterns (also analysed in this Science article).

Whether you use a Bayesian or Frequentist approach is somewhat secondary, get enough data and the "answer" will be there. Whats important though is to choose your hypothesis based on existing knowledge and causality rather than just trawling through "BIG DATA"* looking for patterns.

You might find the following free PDF books of interest...

An Introduction to Statistical Learning
The Elements of Statistical Learning

Both by eminent statisticians, the second treats the subject matter in greater detail than the first.

Good luck with the biostatistics courses

* I can't stand the current buzz around "BIG DATA" these quotes pretty much sum its current state of affairs for me...

Quote

@stephensenn: Teenage sex is like #BigData. You fumble around for ages, don't get as far as you hoped, but claim the earth moved.

Quote

@MikeKSmith: #BigData analysis is like teenage sex: everyone SAYS they're doing it, few are, and if they are then it's not as great as they say it is.

erm, sam · April 14, 2014, 01:11:50 pm

Quote

I disagree I expect that failed attempts do often result in injury particularly on home trampolines where there is no coaching as to how to perform them, nor crash mat being pushed in to cushion the landing when learning*, but there are many who successfully execute the somersault/flip.

How can the authors of this report/paper/what ever, have any clue as to how many failed attempts there are at back flips that didn't result in injury. Did they do a questionaire for all familys with trampolines in the USA asking how many times any children had attempted a backflip and then related this to how many admittances there were in hospital caused by same?

I am sure it would have been more accurate to have said " failed backflips that result in a trip to hospital frequently result in very serious injuries"

Jerry Morefat · April 14, 2014, 01:28:11 pm

Quote from: slackline on April 14, 2014, 12:54:09 pm

A good example of modelling gone wrong is Google Flu Trends which used a "BIG DATA" approach to predicting flu out-breaks based on people's search terms and was initially very successful, however in subsequent years it failed because its based purely on patterns in data rather than what is driving the patterns (also analysed in this Science article).

I think you're being a bit unfair here. The Google model didn't fail, it just didn't perform as well as well as a model underpinned by US laboratory surveillance reports. Nor would you expect it to. I wouldn't have thought the authors of the Google paper would claim that their model is the most accurate of all models. All I expect they were interested in was demonstrating that it is possible to build a decent flu prediction model based on a small subset of Google search term data, essentially the set of search terms ('Influenza Complication', 'cold/flu remedy' etc) which were correlated with real instances of the flu.

slackline · April 14, 2014, 01:40:41 pm

Quote from: Jerry Morefat on April 14, 2014, 01:28:11 pm

Quote from: slackline on April 14, 2014, 12:54:09 pm
A good example of modelling gone wrong is Google Flu Trends which used a "BIG DATA" approach to predicting flu out-breaks based on people's search terms and was initially very successful, however in subsequent years it failed because its based purely on patterns in data rather than what is driving the patterns (also analysed in this Science article).

I think you're being a bit unfair here. The Google model didn't fail, it just didn't perform as well as well as a model underpinned by US laboratory surveillance reports. Nor would you expect it to. I wouldn't have thought the authors of the Google paper would claim that their model is the most accurate of all models. All I expect they were interested in was demonstrating that it is possible to build a decent flu prediction model based on a small subset of Google search term data, essentially the set of search terms ('Influenza Complication', 'cold/flu remedy' etc) which were correlated with real instances of the flu.

Yes I use the term "fail" loosely in that the predictions made by the model didn't match what happened in reality (they over estimated by 50% the number of influenza cases in the winter of 2013 which I wouldn't really consider "a decent flu prediction model" no matter what its based on). But this emphasises the problem with trying to mine "BIG DATA" for patterns and then using it to make predictions. The authors of Google Flu Trends actually worked with the Centre for Disease Control (CDC) which is the group who monitor disease reporting. It also demonstrates that difference between a medical diagnosis and someone using a related search term, because whilst there might be correlation between the two it may not be that strong and the later is therefore a poor proxy for the former.

I just stumbled across this in my lunch hour.... Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance in the American Mathematical Society's "Notice" journal which demonstrates the same point.

Quote from: erm, sam on April 14, 2014, 01:11:50 pm

I am sure it would have been more accurate to have said " failed backflips that result in a trip to hospital frequently result in very serious injuries"

Exactly, poor journalism.

SA Chris · April 14, 2014, 02:36:16 pm

My God, what have I done!

Jerry Morefat · April 14, 2014, 02:38:15 pm

Quote from: slackline on April 14, 2014, 01:40:41 pm

Yes I use the term "fail" loosely in that the predictions made by the model didn't match what happened in reality (they over estimated by 50% the number of influenza cases in the winter of 2013 which I wouldn't really consider "a decent flu prediction model" no matter what its based on).

Well it depends how you measure these things I guess! Granted it performed poorly for 2013, but performed very well in previous years and, on average, performs well.

Quote from: slackline on April 14, 2014, 01:40:41 pm

But this emphasises the problem with trying to mine "BIG DATA" for patterns and then using it to make predictions.

I just stumbled across this in my lunch hour.... Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance in the American Mathematical Society's "Notice" journal which demonstrates the same point.

I'm not sure it does demonstrate your point. There is nothing wrong with using 'patterns in the data' (the big data approach) as a basis for a statistical model per say. The problem is with people not being careful when it comes to model validation (from a quick read, this is what the AMS article is saying).

In the Google paper the model is trained on data from 2003-2007 and then tested on data from 2007-2008. The problem with this, and one of the probable reasons why the Google trend model hasn't done so well of late, is that this isn't a particularly good way to validate a model. The test set is so small (42 data points) and biased (only 2007-2008) that the test set error is likely to be a poor estimate of the true model error.

slackline · April 14, 2014, 03:32:28 pm

Quote from: Jerry Morefat on April 14, 2014, 02:38:15 pm

Quote from: slackline on April 14, 2014, 01:40:41 pm

Yes I use the term "fail" loosely in that the predictions made by the model didn't match what happened in reality (they over estimated by 50% the number of influenza cases in the winter of 2013 which I wouldn't really consider "a decent flu prediction model" no matter what its based on).

Well it depends how you measure these things I guess! Granted it performed poorly for 2013, but performed very well in previous years and, on average, performs well. I take your point though.

Quote from: slackline on April 14, 2014, 01:40:41 pm
But this emphasises the problem with trying to mine "BIG DATA" for patterns and then using it to make predictions.

I just stumbled across this in my lunch hour.... Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance in the American Mathematical Society's "Notice" journal which demonstrates the same point.
I'm not sure it does demonstrate your point. There is nothing wrong with using 'patterns in the data' (the big data approach) as a basis for a statistical model per say. The problem is with people not being careful when it comes to model validation (from a quick read, this is what the AMS article is saying).

In the Google paper the model is trained on data from 2003-2007 and then tested on data from 2007-2008. The problem with this, and one of the probable reasons why the Google trend model hasn't done so well of late, is that this isn't a particularly good way to validate a model. The test set is so small (42 data points) that the test set error is likely to be a poor estimate of the true model error.

Both are poorly validated and therefore make unreliable predictions.

It wasn't just 50% over estimation by Google Flu Trends in winter 2012-13 either, they were off by a similar amount the previous year too.

I don't have a problem with using patterns in data as a basis of a statistical model, the size of the training data set just means the standard errors will be smaller. You can also do all sorts of other validation approaches such as k-fold cross validation or leave one out cross-validation. Models can be useful*, but you need to have something other than statistical correlation on which to base/design your model otherwise you end up drawing nonce sense conclusions from things like this...

I'm not really knocking "BIG DATA" in and of itself but it has the potential to be grossly misused if people jump on the bandwagon and dredge through large amounts of data hoping to find something "interesting". The "interesting" part should come first and the analysis should then follow, be correctly validated and not be used to extrapolate too far beyond the data range itself. Its just statistical analysis with more data than has traditionally been used in the past (which is why I don't like the buzz-pharse). This is in essence what the Science article I linked to previously is saying.

*

Quote from: George Box

...essentially, all models are wrong, but some are useful...

lagerstarfish · April 17, 2014, 08:15:22 pm

http://www.thedailybeast.com/articles/2014/04/17/no-weed-won-t-rot-your-brain.html

Tim Heaton · April 20, 2014, 12:51:02 am

Quote from: slackline on April 14, 2014, 12:54:09 pm

P-values are not meaingless, they tell you the probability of obtaining the observed results if the null hypothesis is true.

I'm afraid they don't tell you this at all. A p-value tells you the probability of observing data at least as extreme (as judged by one's chosen test statistic) than what you did see if the null hypothesis were true. It's

P(T > t | H_0 ),

and definitely not

P(X = x | H_0).

The latter will always be 0 for continuous random variables. This is a really important difference and a common misconception.

Oldmanmatt · April 21, 2014, 09:45:16 pm

While everyone is in the mood for stats and possibly (?) an example of correlation indicating causation...

http://www.bbc.co.uk/news/magazine-27067615

Thoughts?

slackline · April 29, 2014, 08:47:39 am

Quote from: Tim Heaton on April 20, 2014, 12:51:02 am

Quote from: slackline on April 14, 2014, 12:54:09 pm

P-values are not meaingless, they tell you the probability of obtaining the observed results if the null hypothesis is true.

I'm afraid they don't tell you this at all. A p-value tells you the probability of observing data at least as extreme (as judged by one's chosen test statistic) than what you did see if the null hypothesis were true. It's

P(T > t | H_0 ),

and definitely not

P(X = x | H_0).

The latter will always be 0 for continuous random variables. This is a really important difference and a common misconception.

Thats for correcting my rushed post, I had a 13:00 meeting to get to.

Choropleth maps of health & risk in the UK

slackline · April 30, 2014, 09:28:40 am

Or not

Details the poor standards in some psychology/social science researchers/practitioners and journals that publish their work (not exclusive to those areas but one where it is rife).

simes · May 10, 2014, 03:22:28 pm

Quote from: Oldmanmatt on September 14, 2013, 09:17:55 pm

Given the Academic excellence prevalent on this forum, do we not have a theoretical Physicist to make us all feel dumb?

If you're there, are you really going to let an Engineer (a mere Technician) and a Philosopher (dreamer) indulge in such speculation?

Come! Enlighten us poor fools!

What is the most likely avenue of research; that may lead to teleportation.

Given that humanity has been unable to sort out Traffic congestion or devise a computer which does not crash every five minutes or explain Women.

And starts wars because " my mythical, fictitious deity is bigger than your mythical, fictitious deity and he/she/it loves us all more than yours does and really I'm killing you for your own good"....

My Windows laptop hasn't slowed down or crashed in the 3.5 years I've had it.

Oldmanmatt · May 10, 2014, 04:20:35 pm

Quote from: simes on May 10, 2014, 03:22:28 pm

My Windows laptop hasn't slowed down or crashed in the 3.5 years I've had it.

It's a Miracle!

He's the Messiah!

😆

tomtom · May 10, 2014, 05:12:48 pm

And everyone knows that homeopathy is the discipline most likely to lead to teleportation. duh!

slackline · May 12, 2014, 11:41:17 am

Spiegelhalter D (2014) The power of the MicroMort. BJOG: An International Journal of Obstetrics & Gynaecology Volume 121, Issue 6, pages 662–663 (open-access)