Monday, April 27, 2015

Playing With Data - Quadratic Cameos

I haven't been blogging much the past few years, in large part due to being somewhat removed from the science/skeptic/education scene. For the past 3 years I've been working in estate jewelry and am currently functioning as an inventory manager. While it is certainly far afield of my main background, I do frequently find ways to apply my scientific background.

Today was a good example of that. One of our buyers has a fondness for cameos. Unfortunately, cameos just don't sell well anymore. There are some rare exceptions, such as the one pictured here. This one happened to be an 1800s piece in a 22k gold mounting that was in spectacular condition.

But most aren't. The most common ones are cameos carved from shell that I refer to as the "profile of the homely young lady". They often come in a 10k gold mounting, and if we try to sell them at auction, they typically go for roughly the value of the gold in the mounting. Then, after auction fees, we've made less money than if the cameo had simply been pulled out of the mounting and the mounting melted. Thus, when buying, it's important for our buyers to know roughly how much of the weight of a piece is shell and how much is gold.

I've been collecting the cameos that we've pulled out of the mountings for several months now and have a good collection, so I put some data together today and figured that this could be a good project for a math class, looking at a few types of functions.

From each piece in my collection, I took three pieces of information: the height, the width, and the weight. Ideally, I'd have taken a fourth, the thickness, but this is harder to get at: in the real-world situation our buyers face, they likely wouldn't be able to measure it easily. Additionally, I make a weak assumption that the thickness doesn't really change much. After all, even small cameos still need to be fairly thick, or they risk breaking. So I felt ok leaving this out.

On my first pass, I tried putting together equations from just single pairings of the height vs. weight, or the width vs. weight. Before graphing it and letting Excel do the fit for me, it bears some thinking about what the plot might look like. It certainly shouldn't be linear, because what we're really looking at is weight, which tracks volume: length x width x thickness, so naively it should scale towards the 3rd power. But if the thickness stays more or less constant, and the width increases in proportion to the height, then the weight should grow roughly as the height squared. That means I should be looking for the data to fit a second degree polynomial, or a quadratic function.

And sure enough, when I plotted everything up, it ended up coming out pretty well.
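For anyone who'd rather check this outside of Excel, here's a minimal sketch of the same quadratic fit in Python using numpy. The height and weight arrays are made-up placeholders, not my actual measurements.

    import numpy as np

    # Hypothetical cameo measurements: height in mm, weight in grams.
    heights = np.array([18, 20, 25, 30, 35, 40, 45], dtype=float)
    weights = np.array([1.0, 1.3, 1.8, 2.9, 3.7, 4.8, 6.3])

    # Fit weight as a quadratic in height: w = a*h^2 + b*h + c
    coeffs = np.polyfit(heights, weights, 2)
    fit = np.poly1d(coeffs)

    # R^2, the same statistic Excel shows next to the trendline.
    ss_res = np.sum((weights - fit(heights)) ** 2)
    ss_tot = np.sum((weights - weights.mean()) ** 2)
    print(coeffs, 1 - ss_res / ss_tot)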

The thing I really like about data like this is that there are lots of little things you can see by looking at it. The first thing I noticed (I actually noticed it while taking the data) is that there seem to be several somewhat standard sizes. You can see this borne out on the graph because there are several little vertical groups. I hadn't really considered this before, but there's probably a good reason for it. As with many things in jewelry, there's often a sort of "mix and match" that goes on: customers could pick the carved cameo they wanted, and then separately pick out the mounting they liked. If sizes are standardized, jewelers can swap the carvings into the mountings fairly easily.

Another thing that jumped out at me, this time from the graph, is that there is more scatter towards the larger cameos. If this were astronomical data, say a plot of the recession velocity of galaxies as a function of distance, I would expect the larger scatter to come from larger measurement uncertainties at larger distances, and the plot would look about the same. But that's not the case here. In fact, the measurement uncertainty should actually go down as you get to larger heights: I was measuring in mm, so being 1 mm off is a large relative error for the small cameos but becomes rather insignificant for the larger ones.

So where is the breakdown? It's likely in the assumption I called out earlier: the cameos aren't all a consistent thickness. This gets magnified as you get towards larger cameos because the variation in thickness is being amplified by the rapidly growing surface area.

Which brings up another question. I first did this in just one dimension, the height, and there was another hidden assumption in there: that all the cameos are essentially the same overall shape. I only selected the oval ones; none of the ones with clipped corners or heart shapes. But do they really all have the same ratio of major to minor axis? If they don't, then perhaps I'm missing something, and that could be the reason for the scatter on the right of the above graph.

To try to minimize that difference, I looked at the area. Kind of. Instead of going through the full calculation to find the actual area of an oval (A = pi x (major axis/2) x (minor axis/2)), I figured the pi and the factors of two would be common to every piece, so I simplified this down to just the height x width. Plotting that against the weight gave another graph. Again, before looking at the next graph, consider what sort of fit this should be.

If you guessed linear, you guessed right! So what are we learning from this graph?
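For the Excel-averse, the same straight-line fit takes only a few lines of Python; as before, the heights, widths, and weights here are placeholder values rather than my real data.

    import numpy as np

    # Hypothetical measurements (mm and grams); swap in real data.
    heights = np.array([18, 20, 25, 30, 35, 40, 45], dtype=float)
    widths  = np.array([14, 15, 19, 23, 27, 31, 34], dtype=float)
    weights = np.array([1.0, 1.2, 1.9, 2.7, 3.7, 4.9, 6.1])

    area_proxy = heights * widths                         # height x width stands in for the oval's area
    slope, intercept = np.polyfit(area_proxy, weights, 1) # degree 1 = a straight line
    print(slope, intercept)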

We still see the large scatter towards the high end. Similarly, if you look at the R^2 value (a measure of how much of the scatter the fit accounts for), you see that it's only slightly lower than for the simple one-dimensional plot. This is a good indication that the scatter on the previous graphs is not caused by significantly different shapes. But aside from looking at the graph, there's a better way to check. The best way is to simply divide the height of each one by the width and see if there's much variation. I did this and found it was very consistent, right around a ratio of 1.3:1.
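That check is only a couple of lines if you'd rather not do it by hand; again, these height/width pairs are placeholders standing in for the real measurements.

    import numpy as np

    # Placeholder height/width pairs in mm.
    heights = np.array([18, 20, 25, 30, 35, 40, 45], dtype=float)
    widths  = np.array([14, 15, 19, 23, 27, 31, 34], dtype=float)

    ratios = heights / widths
    # Little spread here means the ovals are all essentially the same shape.
    print(ratios.mean(), ratios.std())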

So how will I use this from here? Probably as a tool to give my buyers in the future. I'll likely spare them all the math that I used to come up with it, but given the final graph, they should be able to fairly easily estimate how much of the weight of an item is gold and how much is shell that will get stuck in my giant bag and isn't being turned into money. Perhaps then they can stop paying too much.
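In practice, the fit could even be wrapped up in a tiny calculator for the buyers. The slope and intercept below are placeholders standing in for whatever the final linear fit actually gives, not my real numbers.

    # Placeholder coefficients from the weight vs. (height x width) line;
    # substitute the slope/intercept the real fit reports.
    SLOPE = 0.004      # grams of shell per square millimeter (hypothetical)
    INTERCEPT = 0.2    # grams (hypothetical)

    def estimate_gold_weight(total_weight_g, height_mm, width_mm):
        """Rough split of a mounted cameo's weight into shell and gold."""
        shell = SLOPE * height_mm * width_mm + INTERCEPT
        gold = total_weight_g - shell
        return shell, gold

    # Example: a 10.0 g piece holding a 40 mm x 30 mm cameo.
    print(estimate_gold_weight(10.0, 40, 30))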

Tuesday, April 21, 2015

That's Not How Engagement Works

Let's say I'm a website designer for a company. I'm hired to produce a new website, a better website. It's hard to say exactly what defines a good website and I don't want to have to put together a huge poll of users asking for feedback. I just want to work with what I have available. Namely the analytics my ISP or Google or some other company provides.

One of those can be loosely defined as "engagement". Are users staying on the site longer? Are they clicking things? Do they visit more than just the homepage?

In an ideal situation, you'd hope the answer would be "yes" to many of these. However, just because the answers are "yes" doesn't mean it's a good website design. Rather, it could imply a very bad design: a website so confusing that people have to hunt for the information they need, clicking on more things, loading more pages, and staying longer.

So simply looking at a situation in terms of a metric like this doesn't give the whole picture.

And the same applies to education, where student "engagement" is often used as a proxy for good education. There's good reason for this: studies show that engaged students retain more of the material. However, if the material is poorly constructed, then "engagement" may be more a desperate attempt to make up for that than a sign of quality. Worse, if the material is downright wrong, the students will likely still retain it, making it a net negative for their education.

It seems several teachers in Louisiana don't understand this concept. In a stunning letter unearthed by Zach Kopplin, teachers praise a law passed in 2006, claiming that under it "...students invariably get more involved in the lesson which leads to better discussion and in turn to a higher level of achievement...".

Sounds good, right?

The only problem is that the law has opened the door for Creationism, climate change denial, and any other pseudo-scientific trash politicians want to sneak into the classroom, and that is what these teachers are championing.

But this isn't how engagement, at least as a meaningful metric for academics, works. I recall a discussion in high school in which I and many of my classmates were very involved. The teacher (thankfully not a science or history teacher) was explaining why she thought the moon landing was a hoax. Oddly enough, this was one of my first big encounters with pseudo-science, and it was what led me, personally, to do more research. It's what introduced me to Phil Plait's "Bad Astronomy". So in this one case, it ended up being a positive. However, a few years later in my American History class we had to do presentations. I did mine on the space race and moon landing. Wouldn't you know it, there were questions about whether it had been faked.

Although my History teacher wasn't the one espousing moon hoax nonsense, I recall other spirited discussions in his class regarding the Kennedy assassination. This teacher was a big fan of conspiracy theories about that event, and students knew it. Thus, many times students would try to get him off topic, wasting valuable class time, by engaging him on it. This is another example of how student involvement can be a poor metric.

Thus, it is quite disappointing that so many teachers would defend bad science by perverting what can be a useful metric. But as the computer geeks say, "Garbage in - Garbage out."

Monday, April 20, 2015

HST's 25th Anniversary and Documentary

This week marks the 25th anniversary of the launch of the Hubble Space Telescope. This Wednesday night, PBS will be featuring a documentary, Invisible Universe Revealed, which will look at the history of this amazing instrument.

I'm looking forward to seeing this documentary, not just because I love the HST, but because a post I wrote in 2007 that mentioned the Hubble caught the attention of those working on the documentary. In particular, I noted that the HST added tremendously to our understanding of stellar formation and evolution, and they wanted details. I passed along several thoughts, but I doubt that the hard science will be making the final cut (no, I haven't gotten a sneak peek). So to celebrate Hubble's 25th, here are some of the thoughts I passed along.

Before I start, though, I should give my normal caveat that the discovery process is often muddled in modern science, and in astronomy in particular. One group using one telescope may note something interesting, another using a different instrument does follow-up observations, another group does the math, more observations are made by other people, and while it all supports the hypothesis, not everyone is convinced, and it takes years or decades to form a scientific consensus as more and more results pour in from multiple teams and instruments.

Thus, it is nearly impossible to say "Hubble discovered X". Rarely is astronomy so cut and dry. Rather, we should approach the question from the opposite direction and ask, "What observations might be needed to build and/or support stellar formation theory and has Hubble contributed to any part of that process?"

In particular, there are several things I consider as observational evidence that the theory is correct:

  • For a cloud to collapse to form a star in the first place, it will have to surpass what's known as the Jeans Mass (essentially, having enough mass in a small enough space with the right conditions; one common form of this criterion is sketched after this list). While it's sound physics, if you really want to confirm the models that rely on it, you'd need to demonstrate not only that the necessary conditions of mass, density, pressure, etc. are being met, but that clouds meeting those conditions are actually collapsing. This can be done via spectroscopy by noting that the edge of a proplyd closest to you is redshifted (i.e., it's collapsing towards the center, which is further from you) while the more distant edge is blueshifted (i.e., it's collapsing towards the center, which is closer to you). Indeed, Hubble did just this.
  • Once a larger nebula has begun to fragment, proto-stars should develop inside the proplyds. Hubble was not the first to observe proplyds. In particular, the Infrared Astronomical Satellite (IRAS), launched in 1983, had previously discovered them (for example, here is a paper on them from 1989, although the term "proplyd" had not yet been introduced). However, it was Hubble observations that really did the heavy lifting on proplyds; the field seemed to take off from the HST observations. In particular, visual observations from the Hubble seemed to be what determined that these weren't just clumps, but were flattened, which is a sign that they're rotating and forming disks as predicted by stellar formation theories. A major paper on this was published in 1994 by O'Dell and Wen.
  • For stellar formation to work, forming stars need a way to shed angular momentum: conservation of angular momentum requires that a collapsing cloud "spin up", which would otherwise fling it apart (like a child on a merry-go-round spinning too fast). Several mechanisms have been proposed, but one of the most pronounced is shooting out excess material at high velocities through jets perpendicular to the disk. Such jets have been known since the late 1800s (they're quite large and relatively bright in an astronomical sense, since the ejected material slams into the larger interstellar cloud around it at high velocity). The glowing shocks these jets produce are known as Herbig-Haro (HH) objects, and at the center of these systems we often find extremely young stars such as T Tauri objects. T Tauri objects had long been recognized as a type of variable star, but again, Hubble seems to have been the first to zoom in on them sufficiently to see their structure. Much like the proplyds, they were discovered prior to the Hubble era, but this 1999 paper suggests that their actual structure hadn't been resolved in detail until the HST. In particular, that paper indicates jets were discovered in some of these objects and points to other papers in which jets were found in similar objects thanks to the HST.
  • Another important clue is that we find young stars in the places we expect them to be forming; namely, in dense dust clouds. The problem is that it's hard to see into these clouds to confirm this. In 2009, the HST got a very nice upgrade with an infrared camera that allowed it to peer through the dust and see these young stars still in their shrouds. Again, this wasn't entirely new. The Spitzer Space Telescope had been launched 6 years earlier, but Hubble was definitely a contributor.
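For reference, one common textbook form of the Jeans mass mentioned above, for a cloud of temperature T, mean particle mass mu*m_H, and density rho, is:

    M_J = \left( \frac{5 k_B T}{G \mu m_H} \right)^{3/2} \left( \frac{3}{4 \pi \rho} \right)^{1/2}

A cloud (or cloud fragment) more massive than this can no longer hold itself up with thermal pressure and should begin to collapse.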

Those are really the main pieces of evidence I'd want to see to be convinced our models of stellar formation were correct. However, there's one more way to look at things: stellar evolution is a very hard theory to really prove because we don't get to see a star's life from start to finish. Even the births are hidden inside dense nebulae and proplyds and take hundreds of thousands of years. Trying to study this field is like taking a quick hike through a forest and trying to figure out the entire life cycle of a tree. You can probably do it, because you can see everything from saplings to adult trees to rotting logs. But if your hike is too short, you won't have seen enough to really have a coherent picture.

Prior to the Hubble, we'd walked on trails along the edge of the forest, but Hubble took us deep into its heart. It's not always important that we saw new things or saw them for the first time. It's also important that we just saw more of them; enough to really be sure that we had seen all the steps in the process and that it was always consistent. That's not nearly as glamorous, but in science, that's quite often even more important.