DP Examples 3  1 hr per day April, practice
Poisson’s distribution from coursera notes
This is from the lecture on poisson obtained from here.
At any rate, the (Poisson)[] (distribution)[] is used to (model)[] (counts)[].
The Poisson (mass function)[] is lambda to the x, times e to the negative lambda, all over x factorial, where x is defined on the nonnegative integers: zero, one, two and so on. The mean of a Poisson random variable is lambda, this parameter here. And the variance of this distribution is also lambda.
So that’s an interesting thing to take into account when you model things is if they’re a poisson.
The (mean)[] and the (variance)[] (have to be equal)[], which if you have repeated poisson data, this is a checkable assumption. I’d like to get some instances where we use the poisson distribution.
Modeling count data, modeling event-time or survival data, modeling contingency tables, and approximating binomials when n is large and p is small.
The (Poisson)[1] (distribution)[2]:
 Used to (model counts)[3]
 The Poisson (mass function)[4] is $P(X = x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$ for $x=0,1,\ldots$
 The (mean)[5] of (this)[6] (distribution)[7] is $\lambda$
 The (variance)[8] of (this)[9] (distribution)[10] is $\lambda$
 Notice that $x$ ranges from $0$ to $\infty$
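A minimal sketch of the mass function above, to check numerically that the probabilities sum to 1 and that the mean and the variance both come out as $\lambda$. The function name `poisson_pmf`, the choice $\lambda = 3$ and the cut-off at $x = 100$ are assumptions made for this illustration only.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson random variable with mean lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 3.0  # assumed value of lambda, for illustration only

# The support is 0, 1, 2, ... without end; cutting it off at 100 loses
# a negligible amount of probability when lambda is 3.
support = range(101)
probs = [poisson_pmf(x, lam) for x in support]

total = sum(probs)
mean = sum(x * p for x, p in zip(support, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(support, probs))

print(f"total probability = {total:.6f}")
print(f"mean = {mean:.6f}, variance = {variance:.6f}")
```

Changing `lam` and rerunning should keep the mean and the variance equal to each other, which is the checkable assumption mentioned in the lecture notes.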
For 1 and 2, we think of the plot of “number of plane crashes in a given year” from 1995 to 2018.
For 3, we think of problems like below:
The number of people that show up at a bus stop is Poisson with a mean of $2.5$ per hour.
If watching the bus stop for 4 hours, what is the probability that $3$ or fewer people show up for the whole time?
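A sketch of how the bus stop question can be worked out, assuming the arrivals in the 4 hours are Poisson with mean $2.5 \times 4 = 10$. The helper name `poisson_cdf` is made up for this example.

```python
import math

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for a Poisson random variable with mean lam."""
    return sum(lam ** x * math.exp(-lam) / math.factorial(x) for x in range(k + 1))

rate_per_hour = 2.5   # mean arrivals per hour, from the problem statement
hours_watched = 4     # observation time, from the problem statement
lam = rate_per_hour * hours_watched  # expected arrivals over the whole period

p_three_or_fewer = poisson_cdf(3, lam)
print(f"lambda = {lam}, P(X <= 3) = {p_three_or_fewer:.4f}")
```

The result is roughly 0.01, so seeing 3 or fewer people in 4 hours would be a rare event when about 10 are expected.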
For 4, we think of
https://www.youtube.com/watch?v=8x3pnyYCBto start here on day 2
Poisson distribution introduction JB statistics
A (poisson’s distribution)[1] is another (important)[2] and (discrete)[3] (distribution)[4].
For 1, we think of the following graph, where $\lambda$ is the mean (explained below).
For 2: When we want to estimate the probability that 3 or fewer plane crashes will occur this year we can use the poisson’s distribution to determine it.
For 3, number of plane crashes per year being equal to 3
For 4, we think of the following graph of probability vs outcome for a die roll
Claim: [1] is [2]
[1] is [2] because it allows us to predict the probability of plane crashes in a given year.
Claim: [1] is [3]
Poisson’s distribution can be used for identifying the probability of number of plane crashes in a year (which is discrete, 2 plane crashes, 3 plane crashes, 10 plane crashes).
Example: Asteroids with a diameter of at least 1 km collide with the earth at a rate of approximately 2 per million years.
What is the (probability)[5] that in a (randomly selected)[6] million year period, there is exactly one collision?
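A short sketch of the asteroid calculation, assuming the number of collisions in a million year period is Poisson with $\lambda = 2$. The variable names are chosen for this example only.

```python
import math

lam = 2.0  # collisions per million years, from the example

# P(X = 1) = lambda^1 * e^(-lambda) / 1!
p_exactly_one = lam ** 1 * math.exp(-lam) / math.factorial(1)
print(f"P(exactly one collision) = {p_exactly_one:.4f}")
```

This gives about 0.27, i.e. roughly 27 out of 100 randomly selected million year periods would contain exactly one collision, if the Poisson assumptions hold.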
For 5, we think of a die roll and the (chance of getting an outcome)[5a] 5 is 1/6.
For 5a, we roll a die 10k times, and plot the number of times each outcome (1,2,3,4,5,6) has occurred divided by 10,000. Each of these should be about 0.167 (1/6).
For 6, we think of a
Claims: No claims here!
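The die roll picture for “probability” can be sketched as a small simulation: roll a fair die 10,000 times and look at the relative frequency of each face. The seed and the variable names are arbitrary choices for this illustration.

```python
import random
from collections import Counter

random.seed(0)    # arbitrary seed so the run is repeatable
n_rolls = 10_000

# Count how often each face shows up.
counts = Counter(random.randint(1, 6) for _ in range(n_rolls))

# Relative frequency = number of occurrences / number of rolls.
freqs = {face: counts[face] / n_rolls for face in range(1, 7)}

for face, freq in freqs.items():
    print(f"face {face}: {freq:.3f}")
```

Every relative frequency should land close to 1/6, and the more rolls we make, the closer they tend to get.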
Suppose,
 (events)[7] (are occurring)[7a] (independently)[8] (in time)[8b].
For 7&7a, we think of a plane crash.
For 8, we look at the plane crashes in 2009.
Event 1: A US airways flight ditches into the Hudson River just after taking off due to total engine failure, due to multiple bird strike
Event 2: Another plane crash occurs due to engine failure, but due to an overweight take off, which the engine was unable to handle.
Event 1 would have happened irrespective of Event 2 happening or not. Conversely, Event 2 would have happened irrespective of Event 1 happening or not.
for 8b, we think of the year 2009.
For 7 we think of a coin tosses ending up in heads or tails.
For 7a we think of tossing a coin and getting Heads.
For 7&7a, we think of the outcome being 1 as a result of a die roll.
For 8, we think of 2 coin tosses. Probability of coin toss 1 getting Heads is 0.5. Probability of coin toss 2 getting Heads is 0.5, i.e., coin toss 2 is not affected by coin toss 1.
For 8b, we think of coin toss 1 and coin toss 2 happening one after the other after 1 second.
For 7, we think of 3 cars in 1 hr crossing Street X.
For 8&8b,
When you take an unbiased coin, the (probability)[8c] of H is 0.5 in the first hour. The probability of H is 0.5 in the second hour and 3rd hour as well.
Similarly, we think of the (probability)[8d] of 3 cars passing in the first hour and the second hour and third hour and the fourth hr is exactly the same. Just because 3 cars passed in the first hour, doesn’t mean the probability changes in the 4th hour.
For 8 we think of 3 chocolate chips in a randomly selected volume of 1cm^3 in a dough mixed with chocolate chips. We see here that the probability of finding 3 chocolate chips in a randomly selected volume is X, When we repeat this experiment the probability is still the same.
Let’s consider another example where we add fresh set of chocolate chips, on the dough and do not mix. The probability of finding 3 chocolate chips in 1cm^3 is Y. It appears that when we take another volume to find if there are 3 chocolate chips, the probability of finding 3 chocolate chips is still the same.
For 8c, we think of 500/1000; 1000 coin tosses and expect 500 of them to be heads.
For 8d, we think of 500/1000; out of 1000 hours we expect 500 of them to have exactly 3 cars passing Street X.
Is this “good enough”? Can you give some feedback here. I use more examples for myself to “understand” the “difference”. It is very hard to give examples of independence in the context of poisson’s distribution. All examples also are made up.
 the (probability)[9] that (an event occurs)[10] in a (given length of time)[11] (does not change through time)[12].
For 10, we think of a plane crash.
For 9, we think of the probability being
For 11, we think of one year time period
For 12 we think of the period between 2009 to 2019.
If P of a plane crash is 1 in 11 million in one year. Let’s say 3 plane crashes occurred this year already, it doesn’t mean that the probability of a plane crash i
I am not sure what this means; I have guessed the meaning. Somehow, the sources that I look at, i.e., youtube, books, do not like giving examples for each of the above cases. So in the end I am just left with an interpretation. With respect to a die I have it figured out, with respect to basketball I understand it, but with plane crashes I just cannot give an example, because I don’t really know what probability means.
For 10, we think of rolling a die and getting a 5 as outcome.
For 9, we think of 1/6 being the chance of getting a 5 when a die is rolled.
For 11, we think of 1 toss
For 12, we think of 10 tosses, 100 tosses.
Claim: probability of 10 in 11 does not change with time
The “probability” of getting a 5 is always 1/6, irrespective of how many times the die has been tossed. It doesn’t matter if you got a 5 on the last try.
Here the only known definition of probability is that when I roll the die 1000 times, I see the outcome about 1/6 of the time.
In other words, Lebron James has a free throw percentage of say 75% throughout his career. Let’s say he’s at a game and he attempts 19 free throws and he makes 14 of them. Now does this mean that on his 20th free throw there is a 100% chance of him making it? No, it still means that there is a 75% chance of making it. The only thing this 75% says is that if you attempt 10k FTs you are expected to make 7500 of them. That is all it says. It’s all about large numbers.
A case when this is violated is: Say you have 2 prize tickets in a raffle of 10 tickets. P(getting a prize) = 2/10. Once the 1st prize is given, the P(2nd prize) = 1/9 = 0.11. The probability changes through time.
For 10, we think of the Cars passing Street X.
For 9, we think of 3 cars passing street X within 1 hr.
For 11, we think of 1 hr
For 12, we think that in no way if 3 cars pass this hour, that somehow only 0 cars will pass the next hour. With every hr the probability is assumed to be the same.
At this point it appears that both statements mean exactly the same thing, as the examples provided are exactly the same!
Then X, the (number of occurrences)[13] in a (fixed unit of time)[14], has a (Poisson distribution)[15]:
$P(X = x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$
For 13, we think of 3 cars crossing street X
For 14 we think of 1 hr
For 15 we think of
Claim: If events are occurring independently in time and the probability that an event occurs in a given length of time does not change through time, then Number of occurrences is a Poisson’s distribution.
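One way to make the two conditions in this claim concrete is a simulation, sketched here under made-up numbers. Each hour is cut into many short slots; in every slot a car passes with the same small probability (the probability does not change through time), and each slot is decided by its own random draw (independence). The rate of 3 cars per hour, the 600 slots and the 2000 simulated hours are all assumptions for illustration.

```python
import math
import random
import statistics

random.seed(1)          # arbitrary seed so the run is repeatable

RATE_PER_HOUR = 3.0     # assumed average number of cars per hour
SLOTS_PER_HOUR = 600    # each hour is cut into 600 short slots
HOURS = 2000            # number of simulated hours

# Same chance of a car in every slot: "does not change through time".
p_slot = RATE_PER_HOUR / SLOTS_PER_HOUR

def cars_in_one_hour() -> int:
    # Every slot gets its own random draw: "independently in time".
    return sum(1 for _ in range(SLOTS_PER_HOUR) if random.random() < p_slot)

counts = [cars_in_one_hour() for _ in range(HOURS)]

mean_count = statistics.mean(counts)
var_count = statistics.pvariance(counts)
share_with_three = counts.count(3) / HOURS

# Poisson prediction for exactly 3 cars in an hour when lambda = 3.
poisson_p3 = RATE_PER_HOUR ** 3 * math.exp(-RATE_PER_HOUR) / math.factorial(3)

print(f"mean = {mean_count:.3f}, variance = {var_count:.3f}")
print(f"share of hours with exactly 3 cars = {share_with_three:.3f}")
print(f"Poisson P(X = 3) = {poisson_p3:.3f}")
```

If the simulated mean and variance are close to each other and the share of hours with exactly 3 cars is close to the Poisson value, the counts behave like a Poisson distribution, which is what the claim says should happen.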
Number of occurrences of plane crashes in a year follows a Poisson’s distribution apparently. We see that number of plane crashes in 1984
$\lambda$ is nothing but the (mean)[16] of (number of occurrences)[17] in the (time frame we are talking about)[18]. Not only the (mean)[19] but also the (variance)[20] is equal to $\lambda$.
For 16&19: If 3 cars pass in the first hour, 2 cars in the next and 5 cars in the next and 2 cars in the next, then the mean is 3 cars per hour.
For 17 we think of 3 cars passing in the first hour.
For 18, we think of 1 hr
For 20, we think of this:
$\frac{(3-3)^2 + (2-3)^2 + (5-3)^2 + (2-3)^2}{4} = 1.5$
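The arithmetic for the four hourly counts (3, 2, 5 and 2 cars) can be checked with a few lines. This is only a sketch; the list name is made up.

```python
import statistics

cars_per_hour = [3, 2, 5, 2]  # the four hourly counts from the example

mean_cars = statistics.mean(cars_per_hour)

# Dividing the squared deviations by n (here 4), as in the hand calculation.
population_variance = statistics.pvariance(cars_per_hour)

# Dividing by n - 1 instead, which is what R's var() does.
sample_variance = statistics.variance(cars_per_hour)

print(mean_cars, population_variance, sample_variance)
```

With only four hours of data the mean (3) and the variance (1.5 or 2, depending on the divisor) need not match; the equality of mean and variance is a property of the distribution and shows up in the data only when there are many observations.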
Claim: Not only the mean but also the variance is $\lambda$ (in a Poisson’s distribution)
> a <- rpois(10000,3) # 10k random picks from poissons distribution with mean 3
> mean(a) # MEAN
[1] 3.005
> var(a) # VARIANCE is almost the same as mean!
[1] 2.947466
Lorentz actuator
(Lorentz actuators)[1] are (predominantly)[2] applied in (high precision)[3a] (positioning systems)[3] because of their (inherent low mechanical stiffness)[4] between the (stationary)[5] and the (moving part)[6].
For 1, we think of the following picture where a current carrying coil will move right or left when it is placed in a magnetic field.
For 2, we think of every stage used in every type of ASML machine; there are at least 6 Lorentz actuators.
For 3, we think of a wafer moving stage in ASML machines, which needs to move 0.5 um with 50 m/s^2 acceleration.
For 4, we think of a long loose rope connecting a ship and you. When the ship moves, you will feel nothing until the rope becomes tight. Similarly, let’s say the coil is moving at 1mm/s to the right. If I move the yoke by 1mm to the left, the coil will continue to move to the right at 1mm/s simply unaffected (provided the magnets always cover the coil as shown in the above picture).
In ASML, typically vibrations in the order of nanometers from the yoke, will not be passed on to the coil due to the low mechanical stiffness.
For 5, we imagine the yoke being fixed with bolts to the ground
For 6, we imagine a current carrying coil moving.
Claims:
[1] is applied in 3,3a because of [4]
Lorentz Actuators are used in ASML XY position stages, primarily so that disturbance (running motor causing vibrations) from the outside world does not get transferred through the yoke to the moving coil, aka low stiffness like a loose rope.
Also the (linear relation)[1] between (current)[2] and (force)[3] combined with the (favorable)[5] (dynamic properties)[6] are (important factors)[7].
For 1, think of the relation below, where if the current increases by 1 A, then the force increases by 1 N (assuming BL = 1 N/A)
$F = B I L$
L: length of the windings inside the magnetic field
I: current
B: magnetic field (flux density)
For 2, think of 1A in the coils
For 3, we think of 1N push on the coil of the actuator
For 5 & 6, we think of the low stiffness of say 0.001 N/mm, like that of a “loose” rope connecting a ship and me.
not sure about this number or even if this is how it works as none of the data sheets seem to show what this stiffness is for lorentz motors
For 7: we imagine the ASML stage, which needs to move 1 nm and stay there within ±1 nm. If a machine is running next to it, it will change the position of the stage from 1 nm to 3500 nm. Having a lorentz motor allows the stage to stay at 1 nm ± 1 nm even though machines next to it are moving by 3500 nm.
Claims:
5,6 are “important factors”
Having low stiffness: we imagine the ASML stage, which needs to move 1 nm and stay there within ±1 nm. If a machine is running next to it, it will change the position of the stage from 1 nm to 3500 nm. Having a lorentz motor allows the stage to stay at 1 nm ± 1 nm even though machines next to it are moving by 3500 nm.
1 is an important factor.
Why exactly a linear relation helps is not known to me. It probably has to do with linear control and nonlinear control, but I need to talk to folks to get this info. Here I know that people in the industry generally fear nonlinear systems.
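A small sketch of what the linear relation between current and force looks like in numbers, for the ideal relation $F = B I L$. The flux density of 0.5 T and the wire length of 2 m are hypothetical values picked so that $BL = 1$ N/A, matching the note on [1]; the function names are made up.

```python
def lorentz_force(flux_density_t: float, current_a: float, wire_length_m: float) -> float:
    """Ideal Lorentz force F = B * I * L on a wire in a uniform field."""
    return flux_density_t * current_a * wire_length_m

def current_for_force(force_n: float, flux_density_t: float, wire_length_m: float) -> float:
    """Invert the relation: the current needed to get a desired force."""
    return force_n / (flux_density_t * wire_length_m)

B = 0.5  # T, hypothetical flux density
L = 2.0  # m, hypothetical wire length in the field, so B * L = 1 N/A

for current in (0.5, 1.0, 2.0):
    print(f"I = {current} A -> F = {lorentz_force(B, current, L)} N")

print(f"current needed for 50 N: {current_for_force(50.0, B, L)} A")
```

In this idealized model, doubling the current doubles the force, and the same factor $BL$ applies at every current level, so a desired force translates into a required current with a single division. Whether this is the main reason the linear relation is valued in practice is not something the source states.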
The (low stiffness)[1] (reduces)[2] the (amount of external motion)[3] that is (transferred)[4] from the (support structure)[5] through the (actuator)[6] to the (moving part)[7] ((transmissibility)[8]!).
For 1, we think of the made up 0.001 N/mm
For 3, we think of 3500nm
For 2, we think of 3500nm to 0nm
For 4, we think of stage being moved to 3500nm from 1nm in case motion is transferred.
For 5, frame of the machine in which the stage is
For 6 we think of the Lorentz actuator discussed above
For 7 we think of the moving coil connected to the stage
For 8, we think of 3500nm from the external machine to the stage reduces to 0nm, which means there was no transmissibility (aka 0).
Claims
[1] reduces [3] transferred to [7]
ASML regularly uses lorentz actuators to avoid disturbance in the order of nanometers, i.e., 3500nm of movement to 0nm at the stage.
Maybe I need to write about how the transmissibility is 1, well it is also actively controlled!
I agree that these numbers are made up out of my intuition. I need to talk to some people on monday to understand what is their expectation or understanding. I can speak to Niels, maybe lakerfeld or verboom See who is approachable and ask them some questions.
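To put a rough number on “low stiffness reduces transmissibility”, here is a sketch based on the textbook transmissibility of a passive mass on a spring and damper whose base is shaken. It ignores the active control entirely. All values are assumptions: a 1 kg moving part, the made up stiffness of 0.001 N/mm (1 N/m), a floor vibration of 3500 nm at a hypothetical 50 Hz, and a stiff connection of 1e6 N/m for comparison.

```python
import math

def transmissibility(freq_hz: float, mass_kg: float,
                     stiffness_n_per_m: float, damping_ratio: float = 0.0) -> float:
    """Ratio of moving-part motion to support motion for a passive
    mass-spring-damper with a vibrating base (textbook formula)."""
    natural_freq_hz = math.sqrt(stiffness_n_per_m / mass_kg) / (2 * math.pi)
    r = freq_hz / natural_freq_hz  # excitation frequency / natural frequency
    numerator = 1 + (2 * damping_ratio * r) ** 2
    denominator = (1 - r ** 2) ** 2 + (2 * damping_ratio * r) ** 2
    return math.sqrt(numerator / denominator)

MASS_KG = 1.0             # assumed moving mass
DISTURBANCE_HZ = 50.0     # hypothetical vibration frequency of the support
SUPPORT_MOTION_NM = 3500  # made up support motion from the notes

soft = transmissibility(DISTURBANCE_HZ, MASS_KG, stiffness_n_per_m=1.0)
stiff = transmissibility(DISTURBANCE_HZ, MASS_KG, stiffness_n_per_m=1e6)

print(f"soft connection:  T = {soft:.2e}, stage moves {soft * SUPPORT_MOTION_NM:.3f} nm")
print(f"stiff connection: T = {stiff:.2f}, stage moves {stiff * SUPPORT_MOTION_NM:.0f} nm")
```

With these made up numbers the soft connection passes on a tiny fraction of the 3500 nm, while the stiff connection passes on all of it and slightly more. The transmissibility is small but not exactly 0, and it only drops this low when the disturbance frequency is far above the natural frequency of the mass on its soft connection.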
(These movements)[1] can be caused by (vibrations of surrounding machines)[2] but also by the (reaction forces of the actuator)[3] itself, (exciting resonances in the support structure)[4]. As will be shown later, (Lorentz actuators)[5] have also some (drawbacks)[6] like the (relatively modest force to current ratio)[7] which (limits the maximum acceleration levels)[8] and the (achievable range of motion)[9] or “stroke”. As the name implies, the (Lorentz actuator)[10] is based on the (Lorentz force only)[11].
For 1, we think of a turbopump running at 80000 rpm, present in the same machine as the stage, shaking the parts next to it by 10 microns.
For 2, we think of the 10 microns of movement from the turbopump running in the adjacent machine, which is 1 m away.
For 3, when the stage accelerates with 50m/s^2 and weighs 1kg, it creates a force of 50N on the Yoke as well.
For 4, we think of a turbopump in the same machine as the stage.
For 5, we think of
For 6, I don’t have an example for its drawbacks, other than in theory, as mentioned in [7] and [8].
I don’t need to understand these documents to the letter, unfortunately. Although it is clear how much I suck at it now that I try to give an example. The very foundation that it has low stiffness and what in the actual fuck it means is absolutely not clear. What is limiting the acceleration is not clear, maybe the price of magnets? or the amount of magnetic field and windings!
The Dutch (physicist)[1] and (Nobel prize winner)[2] Hendrik Antoon Lorentz (1853 – 1928) formulated the (Lorentz force)[3] as a (completion)[4] to the (Maxwell equations)[5].
For 1, we think of Albert Einstein, and his “theory of relativity”
For 2, we think of Albert Einstein once again
For 3, we think of the force in the coil in the below picture when the yoke is held in place
For 5, we think of these equations
Not sure if I am expected to come up with other things than the equations, an example perhaps of the usage of one of the equations?
For 4, I don’t know what he meant by completing and ho
Claim
[3] is a “completion” to 5? come back to it!???
The (law of Faraday)[6] describes the (effect)[7] of a (changing magnetic field)[8] on (electrical charges)[9] hence (generating electricity)[10] from (kinetic energy)[11].
For 6,
how can you give an example for the law?
For 8, we imagine a fixed point and a magnet that moves away from and towards it, like below. Say it changes from 5T to 40 T in 4s
For 9, we think of a copper wire loop with a resistance of 2 ohms.
For 10, we think of the electrons moving in the loop and as a result showing 2.5 A on the amp meter.
For 11, we think of electrons moving in the loop with a velocity.
I am not giving an example for 11, as it will take more than an hour to come up with the right example, as I have to redo my fundamentals in current. I leave this for now and move on with knowing that I have failed to provide an example here.
Claim
[8] on [9] generates electricity from [11].
Based on this, When the magnetic field changes from 5T to 40T in 4s, then we have a current generated of 2.5A as a result of the charges moving.
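The numbers in this example (5 T to 40 T in 4 s, a 2 ohm loop, 2.5 A) only fit together for one particular loop area, which the notes do not give. The sketch below uses Faraday's law for a single-turn loop in a uniform field perpendicular to it, where the size of the induced voltage is the loop area times the rate of change of the field, and backs out the area that would be needed. The loop geometry and the function name are assumptions.

```python
def induced_current(b_start_t: float, b_end_t: float, duration_s: float,
                    loop_area_m2: float, resistance_ohm: float) -> float:
    """Size of the current induced in a single-turn loop by a field
    that changes at a constant rate (uniform field, perpendicular to the loop)."""
    emf_v = loop_area_m2 * (b_end_t - b_start_t) / duration_s  # |EMF| = A * dB/dt
    return emf_v / resistance_ohm                               # Ohm's law

B_START, B_END, DURATION = 5.0, 40.0, 4.0  # T, T, s (from the example)
RESISTANCE = 2.0                            # ohm (from the example)
TARGET_CURRENT = 2.5                        # A (from the example)

rate = (B_END - B_START) / DURATION                   # 8.75 T/s
required_area = TARGET_CURRENT * RESISTANCE / rate    # area that yields 2.5 A

print(f"dB/dt = {rate} T/s")
print(f"loop area needed for 2.5 A: {required_area:.3f} m^2")
print(f"check: I = {induced_current(B_START, B_END, DURATION, required_area, RESISTANCE):.3f} A")
```

Under these assumptions the loop would need an area of roughly 0.57 m^2. The current only flows while the field is changing; once the field sits still at 40 T, the induced current drops to zero.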
Based on (energy conservation laws)[12] (creating electrical energy)[13] from (motion)[14] is (fully complementary)[15] to (creating motion energy)[16] from (electrical energy)[17] so the laws of Lorentz and Faraday are (strongly related)[18].
For 12, we think of “(the total energy)[12a] of an (isolated system)[12b] (remains constant)[12c]”
For 12b, we think of a pendulum on a frictionless joint (roller bearing), we lift the ball of mass 1kg to a height 1m.
For 12a, sum of kinetic energy and potential energy = mgh
For 12c, you let go of the pendulum,
At that point, KE =0 and PE = mgh ; total = mgh
At the mid point the KE = mgh, and PE =0; total = mgh
At the other extreme the KE = 0 and PE = mgh; total = mgh
For 13, Current of 2.5 A as a result of varying magnetic field from 5T to 40T in 4s.
For 14, we think of motion of the magnet to produce a change from 5T to 40T.
For 15, I don’t know what it means, I suspect it is fluff!
For 16, we think of the moving coil in the lorentz actuator
For 17, we think of the current in the coil
For 18, I do not know what is the point and what they mean by that.
Claims
13 from 14 is 15 to 16 from 17
No idea what complementary might mean in this case!
the Lorentz and Faraday are “strongly related”
No idea!
Motion control Mechatronic system design
As was presented in the previous chapters, (most)[1] (mechatronic systems)[2a] are (actively controlled)[2] (motion systems)[3], which (implies)[4] that (these systems)[5] are of a (dynamic nature)[6]. If the (mechatronic system)[7] to be (controlled)[8] (does not show any dynamics)[9] within the (required)[10a] (positioning bandwidth)[10], the (entire control problem)[10c] becomes (quasi static)[11] and is therefore (trivial)[12].
For 1a, we think of pick and place robots, wafer scanners
For 1, we think of every wafer stage (made by ASML).
For 3: At ASML, in one of the products, there is a stage which needs to move at 40 g and expected positioning accuracies are 1/10 of a nanometer. This stage is also connected to a cable slab.
For 2: The stage defined above moves from 0 to 100 mm within 0.02 s. The cable slab also follows it. When the stage stops, the cable slab continues to oscillate and is capable of moving the stage in the order of mm and will not settle in 10 ms. A controller actively reads the position every 10,000th of a second and provides movement to the stage so as to counteract the motion introduced by the cable slab.
For 3&5, we think of the wafer stages at ASML once again.
For 6, we think of stages with accelerations moving at 0.5m/s
For 7, we think of the wafer stage connected to lorentz actuators.
For 8, we think of the same example as in [2].
For 9, we think of a wafer stage moving at 0.01mm per second
For 10,10a, The stage is expected to move from 0mm to 300mm.
For 10c, positioning the stage at nm accuracy within 10ms
For 11, correction to the stage needs to be applied every 1ms.
For 12, applying correction every 1 ms instead of every 0.1 ms
I use the word dynamics 40 times a day, and yet when it comes to examples, I whimper like courage the cowardly dog when he sees a ghost
Claims
Most Mechatronic systems are actively controlled motion systems
ASML stages, pick and place robots we see in automotive factories
[2a] is actively controlled implies it is of [6].
ASML Stages are actively controlled and they move at ridiculous speeds of 0.5m/s [6].
if [7] does not show any dynamics[9], then [10c] becomes [11]
When a stage moves at 0.01mm/s [9] and stops at say 10mm then, positioning the stage within a resting time of 10ms [10c] can be done by reading the position every 1ms and providing correction instead of every 0.1ms[11].
[10c] is trivial[12] when [7] shows no dynamics[6]
When a stage moves at 0.01mm/s and stops then, positioning the stage within a resting time of 10ms can be done by applying a correction every 1ms instead of 0.1ms[12].
In (most)[0] (motion control systems)[1] (this)[2] is not the case and (compensation)[3] of the (system dynamics)[4] (via control)[5] is (required)[6] to (achieve the specified performance)[7] in terms of (precision)[8], (accuracy)[9] and (frequency response)[10].
For 0, we think of Kuka making robots and ASML making stages
For 1, we think of ASML stages which need position within 10nanometers within 10ms
For 2, we think of stage moving at 0.01mm/s and stopping at a value say 10mm. The correction applied to the position is done at 1ms instead of 0.1ms
For 3: if a stage is at 0.02 mm instead of 0.03 mm, then a “proportional force” of 0.24 mN is provided by the actuator to bring it to 0.03 mm.
For 4, we think of a cable slab attached to the stage. When the stage stops suddenly (from 10 m/s^2 to 0 m/s^2 acceleration) the cable slab continues to oscillate by 5 to 10 mm. This creates a force on the stage that moves it away from its current position (0.03 mm) by ±10 nm.
For 5, we think of a controller black box which brings the stage to 0.03 mm ± 1 nm within 10 ms by varying the forces in the actuator.
For 6, we think of what happens when there is no control, i.e., the stage would oscillate by ±10 nm for at least 1 s (exceeding the budget by 100 times).
For 7, we think of settling of the stage, within 10ms after accelerating at 10m/s^2.
For 8, we think of the settling of the stage at 0.03 mm within 1nm, @ 0.02mm within 1nm etc…
For 9 we think of the movement of the stage by 0.01mm within 0.5nm, between any two points!
For 10, we think of an excitation at 10hz and magnitude of 1N which the cable slab gets due to the neighboring machine. Because of this the stage will move +10nm (let’s say). The controller needs to counter this deviation by moving the stage opposite to the excitation or damp this out!
Claim
In most motion control systems, [2] is not the case
With Kuka robots and ASML stages, we do not have parts moving at 0.01mm/s with a settling time of >10ms.
In most motion control systems, [3] of [4] via control is required
ASML stages are connected to cable slabs. The stages accelerate at 50 m/s^2 and stop within 10 mm. When they stop at 10 mm, the cable slab continues to vibrate at say 150 Hz with an amplitude of >1 mm, moving the stage beyond the ±10 nm budget. In this case, a “controller” is used to counter the motion added to the stage by the cable slab. If the stage moves at 1 Hz with an amplitude of 10 nm, then the controller uses the actuator to move the stage with the same amplitude in the opposite direction.
This chapter discusses the (various approaches)[1] to guide and (actively control)[2] (motion systems)[3]. As the name already indicates, (motion control)[4] is all about the (control of a machine)[5] to (follow a predefined trajectory in space)[6] and (time)[7], with (various applications)[8].
For 1, we think of (proportional control)[1a], “integral control” and “differential control”
For 1a, we think of stage at 10mm that needs to go to 20mm. The Force applied on the actuator varies with how close the stage is to 20mm.
$F = K_p \times (20\,\text{mm} - \text{current position})$
Here $K_p$ is a constant.
For 2, 3 and 4,5 we think of earlier examples.
For 6, we think of the stage moving to different spots on the wafer starting from the centre and moving outwards at 50m/s^2 for every 10mm
For 7, we think of the above trajectory but that needs to happen so that the stage can handle 200 wafers per hour!
For 8, I have no idea what the author is trying to say.
Claim
its a trivial claim covered above in example 6 and 7, not going to deal with it!
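The proportional control rule from the note on [1a] can be sketched in a few lines: the force is the gain times the distance still to go. The gain of 24 N/m is a hypothetical value, chosen only because it reproduces the earlier made up figure of 0.24 mN of force for a 0.01 mm position error; the function name is also made up.

```python
def proportional_force(kp_n_per_m: float, target_m: float, position_m: float) -> float:
    """Proportional control: force = K_p * (target - current position)."""
    return kp_n_per_m * (target_m - position_m)

KP = 24.0         # N/m, hypothetical gain (0.24 mN per 0.01 mm of error)
TARGET_M = 20e-3  # the stage should end up at 20 mm

for position_mm in (10, 15, 19, 20, 21):
    force_n = proportional_force(KP, TARGET_M, position_mm * 1e-3)
    print(f"stage at {position_mm} mm -> force {force_n * 1e3:+.1f} mN")
```

The force shrinks as the stage gets closer to 20 mm, is zero exactly at the target, and changes sign if the stage overshoots. On its own this rule behaves like a spring attached to the target, which hints at why the integral and differential terms are usually added.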
Examples are (precision position control)[9] with (rejection)[10] of (disturbances due to vibrations from the environment)[11] or (imperfections of the mechanical system)[12] as well as (path planning)[13] and (velocity control)[14] for (scanning applications)[15].
For 9, think of ASML stages where positions need to be within 10nm
For 10&11, we think of stages where
For 12 we think of
Statistics
The P-value is the (probability)[1] (under)[2a] the (null hypothesis)[2] of (obtaining evidence)[3a] as (extreme)[3] or (more extreme)[4] (than that obtained)[5]. If the P-value is (small)[6], then either ($H_0$ is true)[7] and (we have observed a rare event)[8] or ($H_0$ is false)[9].
For 1, we think of getting a sample with a mean difference in heights of fathers and sons of 0.99 only 1 time out of 100000000 samples picked, so the probability is ~0 in this case.
For 2, we think of the difference in height between fathers and their sons being 0.
For 2a&2 together, we think of assuming that the mean of the fathers’ heights is equal to the mean of the sons’ heights.
For 3a, we think of the dataset available in R called “father.son”, which contains heights of fathers and sons in pairs.
For 3, the mean of the difference between the pairs is 1 inch in the sample we have, while we expect it to be 0 inches ± 0.08 inches (std. deviation / sqrt(n)).
For 4, we think of mean of difference >1inch
For 5, we think of mean of difference is 1 inch (which is what we got from the sample)
For me, writing the t-statistic expresses the extremeness, but I am not sure how to write it as I have to explain a normal distribution and then a t-statistic. What is expected here? I still used a lot of jargon here!
For 6, we think of <5% in our case it is ~0%.
For 7, we think of the mean difference in fathers’ and sons’ heights as 0 for the entire India (population)
For 8, we think of a sample where the fathers drink a potion that decreases their height by approximately an inch after their sons are born, i.e., the chance of this sample being picked is <5%.
For 9, we think that the mean difference in fathers’ and sons’ heights is != 0.
Claims
P-value is the probability, under the null hypothesis, of obtaining a mean that is >= the obtained sample mean.
It is a definition. I don’t see this as a claim!
If the P-value is small, then either H0 is true and we have observed a rare event
Let’s say we have knowledge that the fathers in the sample drank a potion before coming to the study that made them 1 inch shorter, and that in all other cases fathers and sons are of the same height.
In this case, the P-value is ~0 for the sample mean ‘difference in heights of father and son’. Here the fathers’ height is the same as the sons’ height in the population, but we have observed a 1 inch height difference in this sample.
If the P-value is small, then H0 is false
The P-value is ~0 (to the 12th digit); in this case, we say that there is no way the fathers’ and sons’ heights could be the same for the whole world.
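A sketch of how the “extremeness” turns into a P-value, using only the two numbers from the notes: an observed mean difference of 1 inch and a standard error of 0.08 inches. It uses the normal distribution in place of the t distribution, which is an assumption that is reasonable only when the sample is large. The function name is made up.

```python
import math

def two_sided_p_value(estimate: float, null_value: float, standard_error: float) -> float:
    """Probability, under the null hypothesis, of a result at least as far
    from the null value as the one observed (normal approximation)."""
    z = (estimate - null_value) / standard_error   # how many standard errors away
    return math.erfc(abs(z) / math.sqrt(2))        # equals 2 * (1 - Phi(|z|))

# Numbers from the notes: observed mean difference 1 inch, H0 says 0, SE 0.08.
p_observed = two_sided_p_value(1.0, 0.0, 0.08)

# For contrast, a hypothetical small difference of 0.1 inch with the same SE.
p_small_difference = two_sided_p_value(0.1, 0.0, 0.08)

print(f"difference of 1.0 inch: p = {p_observed:.2e}")
print(f"difference of 0.1 inch: p = {p_small_difference:.3f}")
```

A difference of 1 inch is 12.5 standard errors away from 0, so the P-value is practically 0: either H0 is true and this sample is an extraordinarily rare one, or H0 is false. The hypothetical 0.1 inch difference gives a P-value of about 0.21, which would not be surprising under H0.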
Three confidence intervals of linear regression
There are three different concepts:

1. The confidence intervals for the coefficients $\beta_0$ and $\beta_1$ in the linear model.
2. The confidence interval for the regression line at a particular value $x_k$.
3. The prediction interval for a predicted value $\hat{y}$ when $x = x_k$.
Confidence intervals for the coefficients in the model (pages 47-49): can be used for hypothesis testing, to see whether a relationship between x and y is likely or not.
Predict a y-value at $x_k$ by $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_k$
Confidence interval for the regression line (pages 49-51): the predicted value lies on the regression line and is the mean of possible y-values. Because of the random nature of your observations, the regression line has a confidence interval for each $x_k$. The intervals lie as a band around the regression line and the width varies with x. You can see the confidence interval for the regression line as the confidence interval for the mean of y ($\mu_y$).
Prediction interval (pages 49-51): the prediction interval is like the confidence interval, but now for the value of y (and not for its mean). It also varies with x, and the band around the regression line is wider. The “real value” for y, given $x = x_k$, lies within the prediction interval with ..% confidence.
I guess in most situations people will be interested in the prediction interval
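The three intervals can be compared side by side with the standard formulas for simple linear regression. This is a sketch on a small made up data set (ten points lying roughly on $y = 2x$); the function name, the data and the choice of $x_k$ are hypothetical, and the critical value 2.306 is the 97.5% point of the t distribution with $n - 2 = 8$ degrees of freedom.

```python
import math

def regression_intervals(xs, ys, x_k, t_crit):
    """Slope CI, CI for the regression line at x_k, and prediction interval at x_k."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    beta1 = sxy / sxx                 # estimated slope
    beta0 = y_bar - beta1 * x_bar     # estimated intercept
    residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
    sigma = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))  # residual std. error

    y_hat = beta0 + beta1 * x_k
    se_beta1 = sigma / math.sqrt(sxx)
    se_line = sigma * math.sqrt(1 / n + (x_k - x_bar) ** 2 / sxx)      # for the mean of y
    se_pred = sigma * math.sqrt(1 + 1 / n + (x_k - x_bar) ** 2 / sxx)  # for a new y

    return {
        "y_hat": y_hat,
        "slope_ci": (beta1 - t_crit * se_beta1, beta1 + t_crit * se_beta1),
        "line_ci": (y_hat - t_crit * se_line, y_hat + t_crit * se_line),
        "prediction_interval": (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred),
    }

# Made up data for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
T_CRIT = 2.306  # t quantile for 95% confidence with 8 degrees of freedom

at_edge = regression_intervals(xs, ys, x_k=10, t_crit=T_CRIT)
at_mean = regression_intervals(xs, ys, x_k=5.5, t_crit=T_CRIT)

for name in ("slope_ci", "line_ci", "prediction_interval"):
    low, high = at_edge[name]
    print(f"{name}: ({low:.3f}, {high:.3f})")
```

The only difference between the last two standard errors is the extra `1 +` under the square root, which is why the prediction band is always wider than the confidence band for the line. Both bands are narrowest at the mean of x and widen towards the edges of the data.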
(Confidence intervals)[1] for the (coefficients in the model)[2] (pages 47-49): (can be used)[3] for (hypothesis testing)[4], to see whether a (relationship between x and y)[5] is (likely)[6] or (not)[7].
this is going to be mighty ass painful! but let’s go!
I don’t know what confidence intervals stand for
In the other case I see a bell curve, I see a mean, I get it!
time
Day 1: 1.6hrs
Got nowhere! Tried and tried, and it seems like I have no idea about Poisson’s
but I felt like I need some formal document which starts from scratch. But let’s see!
Day 2: 2.25 hrs
Still struggling with the same statements. Trying to identify examples, but it took a >2 hrs to find examples that explain “independence over time” and the other condition for Poisson’s distribution.
Day 3:
Sources
Statistics p value, confidence intervals, linear regression, residuals, distributions, bayes stuff!
Continue the dynamics one?
MSD not sure… actuators, PID control?? there will be a lot that I don’t know.
less wrong stuff?
PG stuff?
Summary from my career guide?
What am I confused about?
What to do in life,
What about work document PIR? What about the inertia document!
How to get persuaded article STM
80000 hours! AI, datascience
office doc (don’t publish)
The (X beam)[1] (only consist of)[2] the (Magnet Yokes)[3] for (the X and Y actuators)[4]. (No separate structure)[5]/beam (is necessary)[6]. (The magnet yokes)[7] are (bolted together)[8]. (It)[8a] should be (analyzed)[9] whether (the connection)[10] has (influence)[11] on the (magnet fields)[12] (intended in the yokes)[13]. (If so)[14], A (small nonmagnetic spacer)[15] (between the magnet)[16] (needs to be applied)[17].
For 1, we think of a weight that is 5 times the wafer stage (12kg), so that if the wafer stage moves 1mm the weight moves 0.2mm in the opposite direction. In this case this weight comes from the Lorentz actuators.
For 2, we think of the “X beam” having only the Lorentz actuators and no extra material
For 3, we think of the Lorentz actuator which consists of magnet, placed on iron as shown below (without the coil).
For 4, we think of the Lorentz actuators that move the wafer stage in X and Y directions.
For 5, we think of the Y stage where in addition to the actuators, we also need 300kg’s of mass, where as in the X stage just the actuators and their resultant weight of 60kg is “good enough”.
For 6: If the stage moves at 50m/s^2 for 10mm, it creates a force of 600N in the opposite direction(newton’s law). The Xbeam (60kg actuator) will move only 2mm, as a result of this force. This movement of the Xbeam will not exceed the volume limits given (5mm). Hence no separate structure is necessary.
For 7, we think of
For 8 & 8a, we think of a bolt running through the magnet into the yoke, forcing them to stay together.
For 9, we think of the setup with bolts and without bolts and check the ““force band”” for a ““given current””! this has an “effect” on the “contro”
This is not an example, I get it! An example would be an actual analysis!
For 10, we think of the bolted magnets to the yoke!
For 11, I don’t know what is to be checked, how to know if it has influence or not!
For 12, when there are two opposite magnets facing each other, there is a magnetic field. In the case of the actuators we are using it is about some “X” Tesla!
For 13, I don’t have an example…
For 14, no example, to suggest what magnetic field changes are expected…
For 15, plastic small? how will this solve the problem?
Any more of this is a waste of time. It is not my forte!
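The balance mass reasoning in the notes on [1] and [6] can be checked with two lines of arithmetic. The sketch assumes the X beam is free to slide in X without friction and that everything starts from rest, so the beam moves opposite to the stage in the ratio of the masses. The masses, acceleration, travel and the 5 mm limit are the numbers used in the notes; the 48 kg case is the budget quoted in the document text itself.

```python
STAGE_MASS_KG = 12.0       # wafer stage mass
STAGE_ACCEL_M_S2 = 50.0    # stage acceleration
STAGE_TRAVEL_MM = 10.0     # distance the stage moves
TRAVEL_LIMIT_MM = 5.0      # allowed movement of the X beam

def reaction_force_n(mass_kg: float, accel_m_s2: float) -> float:
    """Newton: the force that accelerates the stage also pushes back on the beam."""
    return mass_kg * accel_m_s2

def beam_travel_mm(stage_travel_mm: float, stage_mass_kg: float, beam_mass_kg: float) -> float:
    """Recoil of a free balance mass: travel scales with the mass ratio."""
    return stage_travel_mm * stage_mass_kg / beam_mass_kg

force = reaction_force_n(STAGE_MASS_KG, STAGE_ACCEL_M_S2)
print(f"reaction force on the X beam: {force} N")

for beam_mass in (60.0, 48.0):
    travel = beam_travel_mm(STAGE_TRAVEL_MM, STAGE_MASS_KG, beam_mass)
    verdict = "within" if travel <= TRAVEL_LIMIT_MM else "outside"
    print(f"beam of {beam_mass} kg moves {travel} mm ({verdict} the {TRAVEL_LIMIT_MM} mm limit)")
```

With a 60 kg beam the recoil is 2 mm, and with the budgeted 48 kg it is 2.5 mm; both stay inside the 5 mm limit, which is the sense in which no separate structure is needed.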
The function of the Magnet Yokes is primarily as the stator part of the wafer carrier actuators, and secondly as a balance mass in X direction.
For a good balance mass function, the driving forces into the X beam need to be in the CoM in both the Z and Y directions. This was one of the reasons for connecting one X motor to the front pillars and one X motor to the rear pillar.
The connection of the Xbeam to the Y connecting bodies is done via linear bearings. These linear bearings give the Xbeam the freedom to move in X direction while giving stiffness towards the Xbeam in the other 5 DoFs.
The total moving mass of the Xbeam is budgeted as 4x the Wafer Carrier mass = 4 x 12 kg = 48 kg.
statistics
Understanding the question quiz2 course 7.
In the mtcars (data set)[1], (fit)[2] a (linear regression model)[3] of weight ((predictor)[4]) on mpg ((outcome)[5]). Get a (95% confidence interval)[6] for the (expected mpg)[7] at the (average weight)[8]. What is the (lower endpoint)[9]?
For 1, we think of the fuel consumption data along with 10 aspects of automobile design aka, mtcars.
For 2, linear regression fit
For 3, we think of a straight line passing through the plot of miles per gallon vs weight of the car so as to produce least error between the line and the actual values.
mpg = beta0 + beta1 *weight
This way for a given weight you can guess what will be the miles per gallon.
For 4, we think of weight. For a given weight we guess the mpg.
For 5, we think of mpg, for a given weight we guess the mpg.
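For 8, we think of x0 = mean(weight) = 1.6 (this looks like short tons; in the data set’s own units of 1000 lb the mean weight is about 3.22)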
For 7, we think of the point on the regression line at the mean
For 6, we think of the band within which the point in 7 varies
       fit      lwr      upr
1 20.09062 18.99098 21.19027
For 9, we think of the lwr value.
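A sketch of how the fit / lwr / upr numbers above can be reproduced by hand. The notes presumably got them from R's lm and predict; this version is in plain Python instead. Assumptions to flag: the wt and mpg values are copied by hand from R's built-in mtcars data, and the t critical value for 30 degrees of freedom is hardcoded (about 2.042272) rather than computed, to avoid needing a statistics library.

```python
import math

# wt (1000 lb) and mpg columns of R's mtcars, copied by hand (assumption: no typos).
wt = [2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190,
      3.150, 3.440, 3.440, 4.070, 3.730, 3.780, 5.250, 5.424,
      5.345, 2.200, 1.615, 1.835, 2.465, 3.520, 3.435, 3.840,
      3.845, 1.935, 2.140, 1.513, 3.170, 2.770, 3.570, 2.780]
mpg = [21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4,
       22.8, 19.2, 17.8, 16.4, 17.3, 15.2, 10.4, 10.4,
       14.7, 32.4, 30.4, 33.9, 21.5, 15.5, 15.2, 13.3,
       19.2, 27.3, 26.0, 30.4, 15.8, 19.7, 15.0, 21.4]

n = len(wt)
x_bar = sum(wt) / n
y_bar = sum(mpg) / n
sxx = sum((x - x_bar) ** 2 for x in wt)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(wt, mpg))

# Least-squares line: mpg = beta0 + beta1 * weight
beta1 = sxy / sxx
beta0 = y_bar - beta1 * x_bar

# Residual standard error on n - 2 degrees of freedom
residuals = [y - (beta0 + beta1 * x) for x, y in zip(wt, mpg)]
sigma = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Standard error of the expected mpg at x0 (here: the average weight)
x0 = x_bar
fit = beta0 + beta1 * x0
se_fit = sigma * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)

T_CRIT = 2.042272  # assumed: 97.5% quantile of the t distribution with 30 df

lwr = fit - T_CRIT * se_fit
upr = fit + T_CRIT * se_fit
print(round(fit, 5), round(lwr, 5), round(upr, 5))
```

At x0 equal to the mean weight the second term under the square root vanishes, so the interval is simply the mean mpg plus or minus T_CRIT * sigma / sqrt(n). That is why the interval is at its narrowest there.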
Claims
None. It was all questions as far as I could see!
If I were to (present evidence)[1] of a (relationship)[2] between (breath mint usage)[3] (mints per day, X) and (pulmonary function)[4] (measured in FEV), you would be (skeptical)[5].
(Likely)[6], (you would say)[7], (‘smokers)[8] (tend to use)[9] (more breath mints)[10] (than non smokers)[11], smoking (is related)[12] to a (loss in pulmonary function)[13]. (That’s)[14] (probably the culprit)[15].’
For 1, we think of a linear regressor between breath mint usage and pulmonary function, such that 90% of the points lie within 2% in Y of the estimated regression line. More like error = something; see also the following cases!
For 2, we think of a 60 degree line plot of breath mint usage in X and pulmonary function in Y, i.e., if breath mint is x0, then y0 = beta0 + beta1*x0. (linear relationship)
For 3, we think of mints per day
For 4, we think FEV measurement
For 5, it could be that mints per day is a direct result of smoking, which could be the actual reason for the measured FEV, for example!
For 6&7, you would guess, think or “naturally believe” as “probable cause”
For 8, People who smoke more than 10 cigarettes a day
For 9, I am not even sure this should be highlighted!
I want to leave out all the bullshit and read just the main words, like leave out tend to use, you would say and all that!
For 10, we think of breathmints_smokers - breathmints_nonsmokers > 3
For 11, we think of people who smoke less than 1 cigarette per day.
For 12, there is a regressor line with y varying by 2%
For 13, FEV
For 14, smokers taking in “more” breath mints than average joes!
For 15, we think of the regression line between breath mint usage and FEV and conclude wrongly that breath mints cause the loss in FEV.
Claims Moving on?
(If asked)[1] (what would convince you)[2], (you would likely say)[3], ‘If (nonsmoking breath mint users)[4] had (lower lung function)[5] (than)[6] (nonsmoking nonbreath mint users)[7] and, (similarly)[], if (smoking breath mint users)[8] (had lower lung function)[9] (than smoking nonbreath mint users)[10], I’d be (more inclined)[11] to (believe you)[12]’.
For 1, no examples are needed.
For 2, we think of a regression where the effect of smoking is removed from the data. For example, if we use the data for nonsmokers only, both those who use breath mints and those who don’t, then it gives more confidence that the hypothesis initially mentioned is right!
For 3, yeah! what is likely? most probably! Having been trained a bit in statistics, we know that correlation is not everything. the effect of other possible regressors needs to be removed!
But what does likely mean? No idea!
For 4, doesn’t need an example as well
For 5, we think of FEV levels that are “low”
For 6, we think of non smoking breath mint users having an FEV of “low” and the non smoking nonbreath mint users having an FEV > “low”.
For 7, doesn’t need an example
For 8, we think of the two arguments
 smoking breath mint users….
 nonsmoking breath mint users!
For 9, we think of “low”
For 10, we think of > “low”
For 11, we think of “pursuing this line of thinking” that breath mints actually cause lung issues.
For pursuing, I would think of checking if there are other variables such as people flossing or not, contributing to the use of breath mints.
You are never 100% sure. At least smoking seems not to cause the issue, but flossing could, so we pursue that line of thinking, i.e., look at the data on flossing people
Check if people who floss and take breath mints and bla bla bla… I understand flossing has nothing to do with lung function or whatever!
For 12, we think of the same thing: pursue other variables. Right now I don’t fully understand the effect of keeping one variable and removing the other. Removing all variables sounds like a plan, as lm does! But of course bcaffo said you need to play with the variables
 In other words, to even consider my results, I would have to
demonstrate that they hold while holding smoking status fixed.
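A toy illustration of “holding smoking status fixed”. All the records below are invented for the sketch (they are not study data): smokers use more mints and have lower FEV, but within each smoking group mint users and non-users have the same average FEV. The crude comparison makes mints look harmful; the within-group comparisons do not.

```python
from statistics import mean

# Hypothetical records: (smoker, uses_mints, FEV in litres). Invented numbers.
people = [
    (False, False, 4.0), (False, False, 4.2), (False, False, 4.1), (False, False, 3.9),
    (False, True, 4.0), (False, True, 4.1),
    (True, False, 3.0), (True, False, 3.1),
    (True, True, 3.0), (True, True, 3.1), (True, True, 2.9), (True, True, 3.2),
]

def mean_fev(smoker=None, uses_mints=None):
    """Average FEV over the records matching the given filters (None = any)."""
    rows = [fev for s, m, fev in people
            if (smoker is None or s == smoker)
            and (uses_mints is None or m == uses_mints)]
    return mean(rows)

# Crude comparison: ignores smoking, so it mixes the two groups together.
crude_diff = mean_fev(uses_mints=True) - mean_fev(uses_mints=False)

# Same comparison with smoking status held fixed.
diff_nonsmokers = mean_fev(smoker=False, uses_mints=True) - mean_fev(smoker=False, uses_mints=False)
diff_smokers = mean_fev(smoker=True, uses_mints=True) - mean_fev(smoker=True, uses_mints=False)

print(round(crude_diff, 3), round(diff_nonsmokers, 3), round(diff_smokers, 3))
```

If the within-group differences had stayed negative, that would be the kind of evidence the quote says would make the skeptic “more inclined to believe you”.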
Summary of my understanding of 80k
So at this point having learnt new techniques to check my understanding, I put it to test on text that I have written earlier. One such text is the 22k article I wrote based on the english and some impact calculations I did! God help the lack of study level data!
(I think)[1] the (DW)[2a] (potential)[2] with (SE)[3a] is very (poor)[3]. (It)[4] does not give me (skills)[5] in statistics or (research)[6]. Based on my (current skill level)[7] I do not think I could (ever)[9] (work)[10] in (places)[11] like MIRI either.
For 1, I think of the time I did “research” or “read through articles” “based on which” I form my “opinion”.
I mean why do you think bla bla bla. Is it really important I talk about “i think”? If I just talked about 2 and 3, I don’t skip anything important like “if DW is “better” SE”
Yea no more I think words shall be respected. We move on I guess.
For 2, I think of number of lives saved, which is the only goddamn thing we care about!
For 3, we think of how SE can make a lot of money (working at Snapchat paying 300k$), but does not get you skills like in statistics or management, as compared to say working in DS. Statistics is almost life here. Critical thinking aka hypothesis and rejecting is the norm here. Scientific way is the shizz here!
For 2a, we think of working at Givewell at a starters position like an analyst and create an impact of 500 to 1.4k lives
For 3a, we think of working at Google as an SE writing python/C code for their software like their search engine, for a salary of 200k$ within the next 5 years.
For 4, I think of SE as described the previous example.
For 5, we think of (“skills in statistics”)[] and (“researching”)[]
For 5a, like listed in the GiveWell website, cause that’s what they are looking for!
For 5b, like listed in GiveWell website
For 6, we think of “stuff” listed in GiveWell website!
For 7, we think of barely writing 5 lines of code without referring to stackoverflow, barely knowing python and just starting with R, lack of ability to write software?
For 9, I think of within my lifetime
For 10, I think of being a researcher
For 11, we think of MIRI
Claims
(DW)[2] (potential)[3] with (SE)[4] is (very poor)[5].
For 2, we think of working at GiveWell
For 3, we think of # of lives saved by getting a job at GiveWell being ~1000
For 4, we think of working at Google making software like Google Maps
For 5, we think of working at google on google maps, without learning any statistics that is very important for GiveWell.
An example of the claim is: working at Google on building Google Maps does not give us the knowledge in statistics (causal inference, limitations of experimental methods) that could in turn get us a job at GiveWell, which could save about “1000 lives” over my lifetime.
(It)[1] does not give me (skills)[2] in (statistics)[3]
For 1, we think of working in C and python building the platform for google maps
For 2, we think of knowing when to use what variables to model miles per gallon vs number of cylinders, for example, In the mtcars dataset, we want to look at outcome miles per gallon and the influence of the predictor: number of cylinders. If you don’t use any of the other variables in your model, you get a slope of
it does not give me skills in research
Why not start with claims and then identify the roots, so that it gives you a heads up on what you need to give examples for, otherwise it feels like pandian is flying in the dark!
SE (could work)[] for (ETG)[] (if it allows me to startup)[] or (somehow it allows me to move into finance)[]. But (we all know)[] that despite the (impact of a startup)[], the (chance of success)[] is way (too low)[]. Regarding getting into (Finance with SE)[], the (whole path)[] and (effective impact)[] is unclear. (It is still expected)[] to (fare poorer)[] than DS for DW. It looks like Software Engineering (seems nice)[] for (postretirement work)[], but considering my (EAO plans)[] or (ETG plans)[], I don’t see Software Engineering (serving me better)[] than DS or MC.
Mechanics
If the trajectory $u_i$ of the particle is such that the effectively applied forces produce no virtual work for any virtual displacement compatible with the constraints, the equilibrium is then satisfied.
The virtual work of the forces effectively applied onto a system of particles is zero with respect to any kinematically compatible virtual displacement if and only if the system is in dynamic equilibrium.
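In symbols, a sketch of the two statements above. The notation is assumed, not taken from the quoted text: $m$ is the particle mass, $X_i$ are the components of the applied force, $\delta u_i$ is a virtual displacement compatible with the constraints, and $k$ runs over the $N$ particles of the system. The inertia force $-m \ddot{u}_i$ is counted among the effective forces. The first line is for one particle, the second for a system:

$$ \sum_{i=1}^{3} \left( m \ddot{u}_i - X_i \right) \delta u_i = 0 $$

$$ \sum_{k=1}^{N} \sum_{i=1}^{3} \left( m_k \ddot{u}_{ik} - X_{ik} \right) \delta u_{ik} = 0 $$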
Break Cryonics down
The (essence of analysis)[1] is to (“break it down”)[2], to (take apart vague wholes into clearer parts)[3]. For the same reasons we make (point lists to help us make tough job decisions)[4], or (ask people who sue for damages to name an amount and break it into components)[5], we should try to (break down these important social claims)[6] via (simple calculations)[7]. And the (absence of attempts at this)[8] is a (sad commentary on something)[9]. — Robin Hanson a while ago
Claim: [1] is [2].
Example: Is this a definition?
essence of (analysis)[1]
For [1], we think of understanding why X wants to sue you for 20 million $.
Claim: [1] is [3].
All the above are definitions! So Skip!
Claim: We do [4], because of [3].
skip due to because!
Claim: We do [5], because of [3].
skip due to because!
Claim: We should [6] via [7].
Question: We____ [6] via [7], or else?
Example: SHOULD!
(break down)[1] these (important social claims)[2]
For [2], we think of claims such as ‘It’s faster to go by car than by tram to Aldi in Den Haag’ from my house in Delft.
break down into simple calculations
For [3], we think of the time it takes to get the car out, the time it takes to go there and then find a parking spot. The same we do with the tram: time to find a tram, and travel in it, in addition to walking to your destination.
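A sketch of that breakdown as a simple calculation. Every number of minutes below is a made-up placeholder, not a measurement; the point is only that each total becomes a sum of parts you can argue about one at a time.

```python
# Hypothetical part estimates in minutes (all values invented for illustration).
car = {
    "get the car out": 5,
    "drive to Aldi": 20,
    "find a parking spot": 10,
}
tram = {
    "walk to the stop": 5,
    "wait for the tram": 7,
    "ride the tram": 30,
    "walk to Aldi": 5,
}

car_total = sum(car.values())
tram_total = sum(tram.values())
faster = "car" if car_total < tram_total else "tram"

# Print the parts so a disagreement can be pinned to one specific line.
for name, parts in (("car", car), ("tram", tram)):
    for step, minutes in parts.items():
        print(f"{name:5s} {step:22s} {minutes:3d} min")
print(f"car total {car_total} min, tram total {tram_total} min -> {faster} is faster")
```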
Claim: Absence of [3] is [9].
Question: __ of [3] is [9].
Example: I don’t know what sad commentary on something means! Maybe the author is saying “it shows about your life”
next
Imagine you disagreed with someone about the fastest way to get from your office to Times Square NYC; you said drive, they said fly. You broke down your time estimates for the two paths into part estimates: times to drive to the airport, wait at the airport, fly, wait for a taxi, ride the taxi, etc. They refused to offer any component estimates; they just insisted on confidence in their total difference estimate.
Similarly imagine someone who disagreed about which of two restaurants was better for a certain group, but wouldn’t break that down into who would like or dislike what aspects of the two places. Or imagine someone who claimed their business plan would be profitable, but refused to break this down into how many of what types of units would be sold when, or what various inputs would cost. Or someone who said US military spending was worth the cost, but refused to break this down into which enemies were how discouraged from what sorts of damage by that last spending increment.
Such (silent disputants)[1] reject (our most powerful tool)[2] for resolving (disagreements)[3]: analysis – (breaking vaguer wholes into clearer parts)[4]. Either they have not used (this tool)[4] to test or (refine their estimates)[5], or they are not willing to discuss (such parts with you)[6].
Claim: [1] rejects [4], for [3].
Example: Mr. T doesn’t say where he has a problem with the estimates; instead he just says that, holistically, it seems unlikely.
Claim: [2] is [4].
Here you either find one other tool that is “more powerful” than analysis or you give an example of how it is “quite powerful”.
For example, at my work, we are building a machine and the cost of goods is of high priority. Here we have an excel sheet showing the costs of each part. We look at which parts have the highest costs and see how we can reduce them.
If I am trying to dispute “most powerful”, then I would go with Eliezer’s example on how to make predictions about the future based on the past. Eliezer says estimates of time, money, and cost are all made based on previous costs! That is one part of it. Breaking down seems to be a part of it.
So loose claim!
Claim: either not used [4] to test/[5], or not willing to discuss [6].
I don’t know how to test for this!
Continuing Robin Hanson
More precisely, if (folks are reasonably smart about when to try to revive you)[1], your (total revival chance)[2] is (something like a sum across all future times)[3a] of (such calculations)[3], each one given (no destructive failed prior attempt)[4].
Claim: if [1], [2] is [3a] of [3], given [4].
Question: if [1], [2] is __, given [4].
Split:
For [1], we think of a time when there is a 90% chance of reviving you.
For [2], I don’t have an example!
For [3a], I don’t know what example to give
For [3], !!!
I don’t feel like I can answer this
Example:
If (you make 50K$/yr now)[1], and (value lifeyears at twice your income)[2], and (discount future years at 2% from the moment you are revived for a long life)[3], but only (discount that future life based on the chance it will happen)[4], (times a factor of 1/2 because you only half identify with this future creature)[5], then the (present value of a 5% chance of revival is $125,000)[6], (which is about the most expensive cryonics price now)[7].
Fun Fact: I wrote a comment on this blog, a [2] years back, of course I didn’t understand shit! Let’s see!
Claim: If [1], [2] and [3], but only [4] * [5], then [6]
Question: bla bla bla __
Split
For [2], we think each year we are able to live after death is worth 100k $/yr now.
That’s a lot???
For [3]: This is my interpretation from the shit that I think is his writing
Year 1 after being revived = 100k, year 2 = 98k and so on…
For 4, assuming a 5% chance of revival, does he mean:
year 1 5% of 100k, year 2 = 5% of 98k or does he mean :
year 1 100k, year 2 = 5% of 2k = 100 $ so life in year 2 is 100k - 100. I don’t think this is true. But maybe he means sumpin else
For [5], we think of 100k/2
For [6],
Ok, so I did 100k * 1/2 * 5% = 2500 $. If you reduce by 2% for every year lived after being revived, we get 2500 for the first year, 2450 the next year, then 2401, and so on; in the 146th year, we get about 130 $. If you add this all up, you get 116k $. Close to 125k (maybe considering an infinite series: 2500 / 0.02 = 125k exactly). For 276 years you get 122k.
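A sketch of the same calculation in code, assuming the discounting means “multiply by 0.98 for each further year after revival” (my reading of the quote, not something the quote spells out). The variable names are made up for illustration. Partial sums from this sketch may come out a little different from the hand figures above, depending on how the years were counted; the infinite-series limit is the number that matters.

```python
income = 50_000                    # $/yr now
value_per_life_year = 2 * income   # life-years valued at twice income
identify = 0.5                     # "only half identify with this future creature"
p_revival = 0.05                   # chance of revival
discount = 0.02                    # assumed: each later year is worth 2% less

first_year = value_per_life_year * identify * p_revival   # value of revived year 1

def present_value(years):
    """Sum of the discounted yearly values over the first `years` revived years."""
    return sum(first_year * (1 - discount) ** t for t in range(years))

# Geometric series limit for an arbitrarily long revived life.
closed_form = first_year / discount

for years in (146, 276, 1000):
    print(years, round(present_value(years)))
print("limit", round(closed_form))
```

The limit is first_year / discount because the yearly values form a geometric series with ratio 0.98, which is where the 125k in the quote comes from under this reading.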
So I conclude that he is right.
P.S. I don’t think I understood him because of splitting but more by chance, as I didn’t ponder enough what he meant and didn’t try out some guess calculations. If I look at what I tried to do, I already seemed to have a good idea about what was required. I just got confused with this discounting and how the years should be taken into account! Anyways, win for Agent!
Example:
Claim: [7] is 125k$
Example: It’s 200k at alcor now!
Persuasion claims
Goal: Identify what exactly you are confused about; Goal is to satisfy the claim. What all can you not supply an example of?