Statistical basis of Differentiation and Geometric Mean Differentiation of an Interval acceleration.

(June 2013 note: This page has now been partly superseded by: The exponential function as geometric mean derivative of Fibonacci algebra.
Investigations into basic operations of geometric mean differentiation.)

As usual, apologies for this page as an individual's effort, without qualifications or check by the qualified. If that is not sufficiently discouraging, the purpose of this page is to apply Minkowski's Interval to my Statistical Differentiation technique (which has been improved since the first attempts of my web page: Statistical Differentiation and the geometric mean derivative).

Traditional Special Relativity is cast in terms of relating observations of observers moving in uniform relative motion, that is constant velocity in a straight line. Traditionly, relative acceleration between observers is a new departure into the General theory of Relativity. This abrupt discontinuity of the two Relativity theories is one of the bumps in theoretical physics, somewhat over-shadowed by other bumps or gaps such as between classical or Newtonian physics (whether or not this is held to include General Relativity) and quantum physics.

Nevertheless, a possible transition, from special to general relativity was the reason why I used my novel GM differentiation to derive a general relativistic acceleration reference frame from the Minkowski Interval, of Special Relativity, which deals in velocity frames.

Traditional differentiation as arithmetic and harmonic mean calculus.

Since my first essay on Statistical Differentiation and a geometric mean derivative, I have thought of a very simple way of showing that traditional differentiation is, in effect, arithmetic mean differentiation or harmonic mean differentiation. It is a commonplace that first principles differentiation or derivation applies as well to small decreases in variables, as it does to small increases in variables. Normally, one would consider the effect of an increase in the independent variable on the dependent variable. Or one might consider a decrease.

One can also consider both an increase and a decrease, when thinking in statistical terms. Statistics is concerned with ranges of items. A range might cover positive and negative values, corresponding to either positive or negative changes in given variables subject to differentiation.

A distribution, which might range from negative to positive values, may have an average or representative value, somewhere mid-range, which statistics has several ways of calculating, in order to represent the nature of any given range.

A range formed of an arithmetic series of values has its average or mean calculated by an arithmetic mean. The average of a harmonic series is a harmonic mean.

The arithmetic mean is used of a sum of ratios, when their denominators are constant. Their shared denominator may be one, so they dont generally appear to be fractions.
The harmonic mean is used when the numerators, of the values to be averaged, share the same constant.

For example, two districts of voters, one with 100 voters, the other with 150 voters, have each been allocated five representatives. That is, their allocation quotas (not elective quotas, which are a little different) are 100/5 = 20 and 150/5 = 30. The proper average to use for this example is the arithmetic mean: if you add two values, you divide the sum by two: (20 + 30)/2 = 25, for the average allocation quota for the two districts.

Suppose the two districts are both of 120 voters, and one district has been given 6 representatives, while the other has 4 representatives. In this case, the two allocation quotas will also be 20 and 30 votes: 120/6 = 20; 120/4 = 30. But this time, the numerator of the two ratios is constant, requiring the harmonic mean for their average.
This time the reciprocals of the two values, 1/20 plus 1/30, are divided by two:

(3 + 2)/(60 x 2) = 1/24. The reciprocal of 1/24, which is 24, gives the average, in this case, the harmonic mean of the two districts' allocation quotas.
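The two district examples can be checked with a short sketch. The function names here are illustrative assumptions, not part of the original page, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

def arithmetic_mean(values):
    """Sum the values and divide by how many there are."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

def harmonic_mean(values):
    """Average the reciprocals of the values, then invert that average."""
    values = list(values)
    mean_of_reciprocals = sum((1 / Fraction(v) for v in values), Fraction(0)) / len(values)
    return 1 / mean_of_reciprocals

# Constant denominator (5 seats each): quotas 100/5 and 150/5.
am_quota = arithmetic_mean([Fraction(100, 5), Fraction(150, 5)])

# Constant numerator (120 voters each): quotas 120/6 and 120/4.
hm_quota = harmonic_mean([Fraction(120, 6), Fraction(120, 4)])

print(am_quota, hm_quota)
```

The same pair of quotas, 20 and 30, averages to 25 or to 24 depending on which part of the ratio is held constant.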

Suppose a function, like velocity equals distance traveled over time, or V = X/T. We could conventionly differentiate in terms of more velocity (+#V) or less velocity (-#V) as a function of more time (+#T) or less time (-#T), respectively.
(Note that the hash sign, #, for a change, usually a small change, in a variable, is usually given in maths texts by the Greek letter d, or delta.)

For instance:

V + #V = X/(T + #T).

Therefore:

#V = X/(T + #T) - X/T = -X.#T/T(T + #T).

Alternatively:

V - #V = X/(T - #T).

-#V = X.#T/T(T - #T).

In either case, this is a function in which the numerator is constant and the denominator variable, so the appropriate average involved is the harmonic mean: invert the two range values; add them and divide by two.

Hence:

2#T/#V = -T(T - #T)/X + -T(T + #T)/X = -T(T - #T + T + #T)/X.

Thus:

#T/#V = -T²/X.

Inverting gives #V/#T = -X/T². This is actually the harmonic mean and also, as it happens, the traditional derivative dV/dT, but without having to take #T to a limit of zero, because following the harmonic mean procedure has already canceled out #T. However, differentiating with imaginary change-variables, say, i#V and i#T, and averaging them with real change-variables would prevent canceling out #T. In that case, both #T and i#T would not disappear till the standard zero-limiting process of differentiation was performed. But the result would still be the same derivative.

And it doesnt matter how many items in the range. If there were three items to be added, this would involve summing the reciprocals and division by three to arrive at their harmonic mean, and so on. Also, as shown, it doesnt matter if items are positive or negative, implying positive and negative directions to each other. It doesnt even matter if the items are imaginary, such as i#V and i#T, which implies a direction at right angles. The same result as the traditional derivative is reached.

In other words, the above traditional derivative may be regarded as effectively a harmonic mean derivative.
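As a hedged check of the cancellation described above, the following sketch computes the change ratio #V/#T for an increase and for a decrease in T, then takes their harmonic mean. The function names are assumptions made for illustration; exact fractions are used so that the cancellation of #T can be seen without rounding.

```python
from fractions import Fraction

def velocity(x, t):
    """V = X/T, kept as an exact fraction."""
    return Fraction(x) / Fraction(t)

def forward_quotient(x, t, dt):
    # #V/#T for an increase: (X/(T + #T) - X/T) / #T
    return (velocity(x, t + dt) - velocity(x, t)) / dt

def backward_quotient(x, t, dt):
    # #V/#T for a decrease: (X/(T - #T) - X/T) / (-#T)
    return (velocity(x, t - dt) - velocity(x, t)) / (-dt)

def harmonic_mean_quotient(x, t, dt):
    """Invert the two ratios, add them, divide by two, and invert back."""
    q_up = forward_quotient(x, t, dt)
    q_down = backward_quotient(x, t, dt)
    return 2 / (1 / q_up + 1 / q_down)

# With X = 6 and T = 3 the traditional derivative is -X/T² = -2/3.
for dt in (1, Fraction(1, 2), 2):
    print(dt, harmonic_mean_quotient(6, 3, dt))
```

Each finite value of #T (short of #T = T, where the decrease divides by zero) gives the same result, which is the point made in the text: no zero-limit is needed in this case.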

Similar considerations apply for functions with variable numerator and constant denominator, to make them effectively arithmetic mean derivatives.

For example: y = x²/a, with variable, x, and constant, a.

Differentiating from first principles:

y + #y = (x + #x)²/a.

#y = (x + #x)²/a - x²/a = (#x² + 2#x.x)/a.

Therefore:

#y/#x = (#x + 2x)/a.

One possible alternative:

y + i#y = (x + i#x)²/a.

i#y = (x + i#x)²/a - x²/a = (-#x² + 2i#x.x)/a.

Therefore:

#y = -i(-#x² + 2i#x.x)/a = (i#x² + 2#x.x)/a.

Or:

#y/#x = (i#x + 2x)/a.

Taking the arithmetic mean of the two results for #y/#x:

2#y/#x = (#x + 2x)/a + (i#x + 2x)/a.

Therefore:

#y/#x = (#x/2 + i#x/2 + 2x)/a.

Thus differentiating, by taking #x to a zero-limit, gives the same result by taking the arithmetic mean of two change-variables, as it does for one change-variable.
Namely, dy/dx = 2x/a.
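The averaging of a real and an imaginary change-variable for y = x²/a can be tried numerically. This is only a sketch: the function names are assumptions, and Python's built-in complex numbers stand in for i#x.

```python
def y(x, a):
    """The example function y = x²/a."""
    return x ** 2 / a

def quotient(x, a, change):
    """#y/#x for any change-variable, real (#x) or imaginary (i#x)."""
    return (y(x + change, a) - y(x, a)) / change

def mean_quotient(x, a, dx):
    """Arithmetic mean of the real and imaginary change ratios."""
    real_q = quotient(x, a, dx)        # (#x + 2x)/a
    imag_q = quotient(x, a, 1j * dx)   # (i#x + 2x)/a
    return (real_q + imag_q) / 2

# The mean is (#x/2 + i#x/2 + 2x)/a, which approaches 2x/a as #x shrinks.
for dx in (0.5, 1e-3, 1e-6):
    print(dx, mean_quotient(3.0, 2.0, dx))
```

Unlike the harmonic mean example, #x does not cancel here, so the zero-limit is still needed to reach 2x/a.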

Thus traditional differentiation has been one change-variable averaging. (In effect, the one change-variable is an average of itself.) That is harmonic mean differentiation or arithmetic mean differentiation, respectively depending on whether the ratio is variable in the denominator or numerator.

Classical calculus is implicitly statistical. It has a graphical interpretation, in which a segment of a curve is focused-on ever more closely until, approaching a limit of zero length, it becomes indistinguishable from a straight line touching or tangential to it. The vertical and horizontal axes of this curve segment might be represented by small changes in distance and time, for example: V = #X/#T. The ultimate ratio of these values for the velocity, is the derivative, dX/dT, which is graphicly represented as the tangent to the curve.

This is not meant to be a proper explanation of the differential calculus.
There seems to be a link between a derivative as a tangent, which is a sloping straight line, and the graphic representation of an arithmetic series which is also a sloping straight line, representing a constant increase - or decrease - like velocity or a constant distance covered over some constant unit of time, like minutes. The steeper the slope of a graph, the greater the velocity.

First principles geometric mean differentiation.

In my first page on statistical differentiation of a geometric mean derivative, my thinking was slightly different to the procedure followed here. There, starting with the form of traditional differentiation from first principles, I took certain steps, which would be redundant from the traditional point of view. Namely, differentiation was not only done using positive change-variables to the dependent and independent variables but the same exercise was repeated with the same result, this time using negative change variables. And then the two results were multiplied together giving the form of a geometric mean.

I recognised this geometric mean form in my elaboration of traditional differentiation from first principles, because it resembles a mathematical form in Special Relativity, namely the Fitzgerald-Lorentz contraction factor, and more generally, in the Interval. In fact, I came to realise that the Michelson-Morley calculation of average light beam journeys should have been with geometric means rather than arithmetic means.

An average or mean is a representative item or value out of a range of values. If the range involves a constant change (like velocity) from one value to the next, that is, if it involves an arithmetic series, then the appropriate average is the arithmetic mean. A geometric mean is of a geometric series like a change in velocity, which is acceleration.

Classical observations of moving objects do not allow for faster-than-light velocities. But the mathematical form of, say, the contraction factor suggests the possibility of a range of velocities from below light speed to a corresponding speed above light. Then, the geometric mean of that range will never exceed light, as corresponds to observation.

It was with this in mind that I first constructed geometric mean ranges from traditional differentiation from first principles, by repeating the process using first positive and then negative changes in the variables, multiplying them together to get a geometric mean form like that of, say, the contraction factor.

Only much later did I seriously consider the simpler possibility of statisticly interpreting just one change-variable in terms of a range, for example, a range from T to #T, rather than a range from -#T, about T, to +#T. This brought another problem of statistical interpretation of the simpler case. Two range limits, -#T and +#T, can give the geometric mean for this range by multiplying them together and then taking their square root. If there were three values in the range, the cube root would be taken.

With only one value in the range, say, #T, there is no other value to multiply by, and a unitary root is taken; in other words, it is an identity operation, making no change. (That is to say, equations 9 and 10, below, can still be implicitly interpreted as going thru the formal procedure of taking a geometric mean.) There is a geometric mean derivative when an infinite or unlimiting process is followed. This contrasts with a traditional derivative, actually in statistical terms, an arithmetic mean derivative, when the standard zero-limit procedure is followed.

However many values taken in a given range, one, two, three, etc, the outcome of differentiation is essentially the same function. This is true whether doing traditional differentiation, considered as arithmetic mean differentiation, or my innovation of geometric mean differentiation.

This may not mean that it is always redundant, in either AM or GM differentiation, to differentiate with more than one change variable (pair, to the dependent and independent variables). If one already has theoreticly perfect knowledge, it may make no difference, but lacking a perfect fit curve thru some given data, taking a large number of change variables, for many items in a distribution, may be necessary to find a curve that best fits a scatter of values distributed on a graph, as a result of measurements, which are always imperfect.

This page doesnt need to worry about the trials and errors of empirical findings, so the differentiation can be kept at its simplest, while not forgetting that differentiation is implicitly statistical.

My web page on Statistical Differentiation invented an extended version of traditional differentiation from first principles to derive the normal distribution. This can be applied to the Interval to derive it in terms of a normal distribution. From my point of view, this has the advantage of being a step by step process, or deduction, to arrive at a normal distribution that has a definite, and hopefully meaningful relation to Special (and possibly General) Relativity.

Normally, differentiation takes the (independent) variable's change, say #T, to a limit of zero. The purpose is to get an ultimate result from making that change variable smaller and smaller. What I called geometric mean differentiation essentially takes the limit to infinity. It actually "unlimits." First, the equation, put in terms of the binomial theorem, is raised to the power of the change variable, which is then allowed to approach infinity.

By a standard text-book procedure, the binomial form transforms into exponential form: an equation, expressed in terms of the binomial theorem, enables a binomial series to be transformed into an exponential series, when it is raised to an infinite power.

Geometric mean differentiation applied to the Interval.

The Interval, I, is the measure common to observers of an event seen from different points of view, when velocities approach light speed, c. The light speed is constant for all observers. This is an evidence-based principle of Einstein's Theory of Special Relativity. Two observers' local times are t and t', with local distance measures, x and x'. Different local measures come to the same thing, as shown in the Interval, equation 1:

I² = (ct)² - x² = (ct')² - x'²

Texts usually show three dimensions of distance or space, for the two observers: x, y, z and x', y', z'. Here it is assumed x is either just one dimension or a combination (vector) of two or three dimensions.

My introductory page showed Statistical Differentiation into the normal distribution. This random distribution was explained in detail. A normal distribution is summarily described in terms of its norm, the measure of central tendency. That is about where most of the items in a distribution congregate. A contrasting measure (such as the standard deviation) is of the distribution's spread. These two measures are said to sum-up the differences between normal distributions, as I seem to remember from my first lessons in elementary statistics, long ago.

Many years later but still quite long ago, I came across a third measure (parameter, I think is correct usage, and I recall there are still other lesser adjustments possible) which is of the hump of the bell-shaped normal curve. (There is a technical name for this but it just translates as humpedness.)
Finally, not so many years ago, I read a popular maths book that mentioned how the experts recently found a function that superseded the normal distribution, for some purposes, anyway.

Here, I have modified my technique in deriving the normal distribution in terms of the Interval. To do this, it helps to put the Interval into a quadratic equation, which is a novel form for it:

I² - 2ixct = (ct)² + (ix)² - 2ixct = (ct - ix)²,

where i = (-1)^1/2, which means the square root of minus one.

Hence equation 2:

I² = (ct - ix)² + 2ixct = (ct' - ix')² + 2ix'ct'.

The right side of equation 2 repeats the form with indices for a second observer. This is just to emfasise that the Interval still stands for a common measure by different local observers, tho departing from the conventional form of equation 1.

In fact, equation 1 equates to equation 2 so that one observer's measures could be put in the conventional form and the other observer's measures put in the new quadratic form like equation 3:

I² = (ct')² - x'² = (ct + ix)² - 2ixct.

It will be convenient to deal in the quadratic form of equation 2 for the following derivation, but it may be useful to remember that I² still can also mean equation 1 with another observer's different local space-time measures.
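A quick numerical check that the quadratic forms agree with the conventional Interval is sketched here. The function names are assumptions, and Python's complex type plays the part of i.

```python
def interval_squared_conventional(c, t, x):
    """Equation 1: I² = (ct)² - x²."""
    return (c * t) ** 2 - x ** 2

def interval_squared_minus_form(c, t, x):
    """Equation 2: I² = (ct - ix)² + 2ixct."""
    ct = c * t
    return (ct - 1j * x) ** 2 + 2j * x * ct

def interval_squared_plus_form(c, t, x):
    """Equation 3: I² = (ct + ix)² - 2ixct."""
    ct = c * t
    return (ct + 1j * x) ** 2 - 2j * x * ct

# Example values in units where c = 1: ct = 5, x = 3, so I² = 25 - 9 = 16.
print(interval_squared_conventional(1, 5, 3))
print(interval_squared_minus_form(1, 5, 3))
print(interval_squared_plus_form(1, 5, 3))
```

In both quadratic forms the imaginary cross-term, 2ixct, cancels, leaving the real value of equation 1.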

By the way, the term, 2ixct, has a geometric meaning, for example, in the Michelson-Morley experiment. This experiment measures a reflected light beam, split so as to go with and against earth motion, and to go, back and forth, at right angles to earth motion. Light takes the same time to return either way.
(See my web page, The Minkowski Interval predicts the Michelson-Morley experiment.)

The term, 2ixct, can be identified with the area covered by the light beam's journey at right angles to the Earth. (Indeed the symbol, i, can stand for an operator, meaning the operation turn thru ninety degrees. That is just as a minus sign, which equals i², means turning thru 180 degrees, or going in the opposite direction, as applies to the other part of the split light beam going back and forth with Earth motion.) The distances ct and ix are the distances covered by the light beam and the earth, at right angles to each other. Thus multiplying them gives an area. The number two doubles the area with two-way journey from the beam's reflection.

The next steps are to transform the Interval equation 2 into a form, such as could be reached by traditional (arithmetic mean) differentiation from first principles, as a starting point for geometric mean differentiation that arrives at a normal distribution. It is the eventual form of this distribution in mind, that is the cause of the following seemingly arbitrary manipulations of the equations.

If iut = vt = y, and -iut = -vt = -y, a negative sign may imply an opposite direction, and an imaginary sign, i, may imply a direction at right angles to a positive sign. The actual speed of u and v remains the same.
It is slightly simpler to work the quadratic form of the Interval, not in terms of: (ct + ix)² but of: (ct - ix)² = (ix - ct)². In this latter case, shown in table 1, below, y = ix.

The significance of this for an eventual equation of the normal distribution curve is that its independent variable can be considered in two full dimensions of x and y co-ordinates. The dependent variable, the frequency, thus builds up a profile of this bell-shaped curve, not just as a second dimension for a section of a bell but as a third dimension making the usual bell-shape in three dimensions.

Direction difference, rather than speed difference, is a consideration when interpreting the Michelson-Morley calculation in terms of the Interval. The conventional Interval form, as it stands, effectively gives the equation for the M-M experiment on a light beam moving with and against Earth motion. But the other part of the light beam, split to move back and forth at right angles to Earth motion, can be expressed simply by re-stating the Interval in imaginary terms (as explained in web page: The Minkowski Interval predicts the Michelson-Morley experiment).

It makes no difference whether the Interval is expressed as:

I² = (ct)² - x² = (ct)² + (ix)² = (ct)² + (-ix)².

The difference between +ix and -ix, in context of the M-M expt. seems merely to be that a light beam can travel cross-wise to earth motion, in two opposite ways, which are positive and negative to each other but make no difference to the result. (On my page, Null and non-null Michelson-Morley type experiments, I did have cause to use both ix and -ix.)

Re-arranging terms and dividing thru by 2ct(1-t), equation 2 leads to equation 4:

I²/2ct(1-t) = {(ct - vt)² + 2vtct}/2ct(1-t) = t²(c - v)²/2ct(1-t) + 2vtct/2ct(1-t).

Therefore 5:

I²/2ct(1-t) = t²(c - v)²/2ct(1-t) + vt/(1-t) = T + vt/(1-t),

where T = t²(c - v)²/2ct(1-t).

#T stands for a change in the independent variable, T. This change variable, #T, will disappear when the geometric mean differentiation is completed in the limiting process, or rather the unlimiting process: a difference between traditional differentiation and this novel version.

The Interval is being manipulated such that it is based on a differentiation from first principles of the simple equation 6a:

V = X/T. Or 6b: T = X/V.

Traditional differentiation considers how a small change in T, #T, affects a small change in V, #V, or vice versa in the case of eqn. 6b, which leads to eqn. 7:

T + #T = X/(V + #V).

Hence 8:

#T = X/(V + #V) - X/V = -X.#V/V(V + #V).

Hence 9:

#T/#V = -X/V(V + #V).

It so happens that my novel form of differentiation requires the following non-standard considerations of eqn. 9 transformed:

#T/#V = -X/V(V + #V) = -X/V.#V(1 + V/#V).

The right-side factor, when raised to the power of #V, as #V goes to infinity, is known by a standard procedure to become the exponential constant, e (here called the exponent), to the power of V. The exponent, e, is an unending decimal constant, 2.718... etc.
The coefficient, -X/V.#V, must equal one, to remain one, rather than become indeterminate, when also raised to an infinite power.

Therefore, let #V = -X/V.

Since dependent variable, T, is a function of independent variable, V, this leaves X as a constant in the equation T = X/V. The Interval, I, is a constant, because it is the common space-time measurement to all observers of a special relativistic event. And the Interval also has the right dimension of distance, so it fits in with the ratio of distance over time being equal to velocity. Then let X = I. Thus, T = I/V.

Hence eqn. 9 becomes eqn. 10:

#T/#V = -I/V.#V(1 + V/#V) = 1/(1 + V/#V).

Traditional differentiation would take #V, in eqn. 9, to a limit of zero. My novel form of differentiation raises eqn. 10 to the power of #V, as in eqn. 11:

(#T/#V)^#V = 1/(1 + V/#V)^#V.

and takes it to infinity or unlimits #V.

Before so completing a geometric mean differentiation, it may be helpful and useful to show how, at first, #V might be a finite number. Then equation 11 merely requires expanding according to the binomial theorem. For convenience in doing this, equation 11 is inverted. Expanding the binomial factor into a binomial series, with a one in the factor, (1+V/#V) gives eqn. 12:

(#V/#T)^#V = (1 + V/#V)^#V =

1 + #V.V/#V + {#V(#V-1)/2!}(V/#V)² + {#V(#V-1)(#V-2)/3!}(V/#V)^3 + ...

= 1 + V + {(1-1/#V)/2!}V² + {(1-1/#V)(1-2/#V)/3!}V^3 + ...

As the change variable, #V, approaches infinity, a binomial series transforms into an exponential series, which sums in terms of the exponent, e, to the power of V. And eqn. 12 becomes eqn. 13:

(dV/dT)^dV= 1 + V + V²/2! + V^3/3! + ... = e^V = e^I/T = e^-T/I.

To show infinite differentiation, the hash sign is changed to a d-for-differentiation sign (following the example of traditional or limit differentiation).
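The standard passage from the binomial form to the exponential form, as in eqn. 12, can be watched numerically. In this sketch the function names are assumptions, and n stands for the change variable #V as it grows.

```python
import math

def binomial_power(v, n):
    """(1 + V/#V)^#V with #V = n."""
    return (1 + v / n) ** n

def binomial_coefficient_factor(k, n):
    """(1 - 1/n)(1 - 2/n)...(1 - (k-1)/n): the factor multiplying V^k/k! in eqn. 12."""
    factor = 1.0
    for j in range(1, k):
        factor *= 1 - j / n
    return factor

v = 1.5
for n in (10, 1000, 10**6):
    # The power approaches e^V and each series factor approaches one.
    print(n, binomial_power(v, n), binomial_coefficient_factor(3, n))
print("e^V =", math.exp(v))
```

This only illustrates the textbook limit used in the text; it does not by itself test the further substitutions made for V.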

Setting eqn. 13 equal to F (for "frequency"), eqn. 14:

F = (dV/dT)^dV = e^-T/I =

e^-t²(c - v)²/2ct(1-t)I = e^-t²(v - c)²/2ct(1-t)I .

Equation 14 is the infinite or unlimited derivative, in contrast to traditional derivatives using zero limits. It will prove convenient to change round (c - v)² to (v - c)², which does not alter the maths.

The infinite derivative is an inversion of T with respect to V. These variables, roughly speaking, are of the order of a change of velocity with respect to a change in time. Equation 14 therefore has the dimensions of an acceleration derivative, labeled: F for frequency, because the equation of the normal distribution gives non-uniformly varying frequencies. That means the normal curve shows an "acceleration" of frequencies.

The divisor, ct(1-t), was introduced because I wanted to express time as a probability, so that the Thermodynamics definition of time is made consistent with Relativity.

It's long been known what such as equation 14 means, tho it hasnt been treated in terms of a new kind of derivative. It is an equation of the varying height of the normal distribution, the bell-shaped curve from its flared edges thru its dome in between. The normal curve is symmetrical about its central dome to left and right. In probability theory, this symmetrical shape represents the varying outcomes with equal chances of success or failure of a repeated event. The curve can also be asymmetrical or skewed depending on how unequal the probabilities of success or failure.

A skewed distribution would send the dome, representing the bulk of events, to left or right, that is to greater than even probabilities of success or failure. If the curve is completely skewed one way or the other, the slope becomes like a mountain ski slope: very steep at first but increasingly gradual, tho never quite flattening out.
Such a decreasing rate of change is sometimes called exponential decay. It is still an exponential function (such as: e^-x).

The curve of exponential decay is decelerative or just the converse of exponential growth, which is accelerating growth, such as: e^x, where x is positive. For exponential growth, e^x, the area under the curve represents the sum of the exponential series. So, each point on the curve might represent a successive term in the series.

Suggested relation of traditional differentiation to geometric mean differentiation.

From the point of view of a statistical basis to differential calculus, the interesting thing about the exponential decay series is that each successive term is the inverse of successive derivatives from repeated differentiations by conventional calculus (which are harmonic mean differentiations, in my terms). Provided, that is, the series is divided by the function in question.

For example, take v = x/t and y = (t/x)e^-t.

Going back to the first section, the differentiation, of v = x/t, was characterised as implicitly a harmonic mean (HM) differentiation. Successive differentiation of v yields equations 15:

v = x/t; v' = -x/t²; v'' = 2x/t^3; v''' = -3!x/t^4 ....

The expansion of the exponential decay series is eqn. 16:

y = (t/x)e^-t = (t/x){1 - t + t²/2! - t^3/3! + t^4/4! - t^5/5! + ... }

= t/x - t²/x + t^3/2!x - t^4/3!x + t^5/4!x - t^6/5!x + ...

It will be seen that successive HM derivative equations 15 are just the inverses of the successive terms in a geometric mean differentiation, such as was carried-out to derive equation 14.
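The term-by-term correspondence between equations 15 and 16 can be checked with a small sketch. Each term is held as a (coefficient, power of t) pair; the function names and this representation are assumptions for illustration.

```python
from fractions import Fraction
from math import factorial

def successive_derivatives(x, count):
    """(coefficient, power) pairs for v = x.t^-1 and its next count-1 derivatives."""
    coeff, power = Fraction(x), -1
    terms = []
    for _ in range(count):
        terms.append((coeff, power))
        # Power rule: d/dt of coeff.t^power is coeff.power.t^(power - 1).
        coeff, power = coeff * power, power - 1
    return terms

def decay_series_terms(x, count):
    """(coefficient, power) pairs for the terms of (t/x)e^-t, as in eqn. 16."""
    return [(Fraction((-1) ** n, factorial(n)) / x, n + 1) for n in range(count)]

def reciprocal(term):
    """Invert a single term: 1/(coeff.t^power) = (1/coeff).t^(-power)."""
    coeff, power = term
    return (1 / coeff, -power)

inverted = [reciprocal(term) for term in successive_derivatives(7, 6)]
print(inverted == decay_series_terms(7, 6))
```

The constant x = 7 is arbitrary; any non-zero value gives the same match between the inverted derivatives and the series terms.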

So the exponential decay series offers a simple relation between traditional differentiation and my new version, geometric mean differentiation as an exponential decay function, as a result of a change variable made to approach infinity, instead of zero.

In the first section, I explained that traditional differentiation implicitly consisted of both a Harmonic Mean differentiation and an Arithmetic Mean differentiation. Equation 15 showed successive HM differentiations. Successive AM differentiations, if they are given in reverse order, can be made to correspond to the inverses of successive HM differentiations, for example, transforming:

v''' = -3!x/t^4 into: V = v/v''' = x/tv''' = -t^3/3!

then, in equations 17:

V = -t^3/3!; V' = -t²/2; V'' = -t; V''' = -1.

When the order is reversed, these terms form the negative of the exponential series, -e^t. The signs could just as easily be made all positive, which would simply give the first few terms of the exponential series, that sums to e^t. However, equation 15 is the first few terms of an alternating series. If the inverses of all its terms were added (and all divided thru by t/x), the sum would be the inverse exponent, e^-t.

Successive AM differentiations sum to the exponential function but the order of differentiation is the reverse of the successive terms of the exponential series. In this sense, Arithmetic Mean differentiation is an anti-differentiation. In traditional calculus, anti-differentiation corresponds to what is called integration. To integrate is to join together to make whole. And the joining together or summing the terms of successive AM differentiations is to add terms of the exponential series towards its sum in the exponential function, such as e^t.

The graph of the exponential series, term by term, joins together, slice by slice, the area under the exponential curve. But those terms can just as well be seen as successive AM differentiations. The inverse exponential series, e^-t, sometimes called exponential decay, graphs in reverse to the exponential series, e^t. Successive HM differentiations need their ratios inverted to match the terms of exponential decay. So, they only indirectly represent successive slices of the area under its curve.

Traditional calculus doesnt make the distinction between AM differentiation and HM differentiation. The index order of successive AM derivatives decreases, while that of HM derivatives increases. Apart from this reverse effect of their differentiations, they could only correspond, given ratio-inversion and sign-alternation of the AM derivative terms.

There are different kinds of exponential function, notably the normal distribution, essentially a random distribution, the bell-shaped curve, with its widely varying gradients, measured tangentially to the curve. Tangents graphicly represent derivatives.

I am not a mathematician nor do I attempt rigorous proofs. I am just an old and out-dated amateur explorer. But it seems to me that the above basic relation is a good candidate for a so-called fundamental theorem of statistical differentiation, relating traditional differentiation (as arithmetic mean differentiation) to my innovation of geometric mean differentiation.

It doesnt matter that this basic relation, of AM differentiation to GM differentiation, is disguised for exponential functions other than the most basic form, say: e^T. The independent variable, T, can still stand for a more complicated variable, such as the normal distribution curve, in eqn. 14.

The sum of successive heights, calculable by equation 14, gives an approximation to the area below the normal curve. The sum of the successive terms of the exponential series gives an approximation to the area under its graphed curve. Most simply, this area can be considered as the exponential series of terms: 1 + 1/1 + 1/2! + 1/3! + 1/4! + ... This infinite series converges, in its first few terms, to a good approximation of the constant called the exponent, e = 2.718... etc. That is: e^x where x = 1.
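How quickly the first few terms approach e can be seen in a minimal sketch; the function name is an assumption.

```python
import math

def exponential_partial_sum(x, terms):
    """Sum the first `terms` terms of 1 + x + x²/2! + x^3/3! + ..."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # next term: x^(n+1)/(n+1)!
    return total

for terms in (3, 6, 10, 15):
    print(terms, exponential_partial_sum(1, terms))
print("e =", math.e)
```

With x = 1, six terms already agree with e to two decimal places.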

In traditional calculus, the basic relation or "fundamental theorem of calculus" is that the reverse of differentiation or anti-differentiation is effectively integration, so-called, usually the exact calculating of a given area under a curve, using a limiting process.
In traditional calculus, an exponential function, of the form, y = e^x, is the particular case that remains the same under differentiation, and, therefore, under reverse differentiation. This remains essentially true of less simple exponential functions, in that the derivative will still have the exponent, tho any other modifications, such as coefficients, will be transformed such as to reduce the magnitude of successive derivatives.

Generally, successive derivatives are reductions, if the function is continuously differentiable. Independent variables in numerators generally are reduced an order of magnitude at a time till they reach zero. Independent variables in denominators may increase by an order at a time but that only makes a smaller fraction. They are, after all, the result of a repeated zero-limiting procedure.

A new kind of derivative, a geometric mean derivative actually can create an exponential function (including the circular function kind, explained below), by applying an infinite limiting procedure. It will do this not only for a GM derivative with the dimensions of acceleration but also for a GM derivative with dimensions of velocity. This seems puzzling because velocity is only a constant change. It graphs as a sloping straight line, unlike an exponential curve.

One thinks of classic geometry, where parallel lines never meet, whereas in the geometry of curvature, lines of longitude, which are parallel at a sphere's equator, meet at the poles. An object can move in a circle at constant speed but velocity, as distinct from speed, is a vector, which has direction as well as magnitude. Circular motion changes the direction of the velocity. Tho there is no tangential acceleration to the circular motion, there is still centripetal acceleration: like a satellite "falling", under gravitational acceleration, into orbit at a constant speed round a planet.

Circular functions are more simply expressed, than with trigonometry, by exponential functions with imaginary indices. The most famous of these Euler formulas is: e^iπ +1 = 0. This equation is of the two most important or basic numbers, the two most important constants and the two most important operations in mathematics.

What this mystical formula actually means is: e^1iπ = -1,0. Where -1 and 0 are respective x and y co-ordinates of (half way round) a circle of radius one unit.
A coefficient of zero in the index would give the co-ordinates, 1,0, which would be repeated with the index coefficient, two, corresponding to the operation of going once round a complete circle. Thus, number coefficients, including fractions, in the index, can pin-point positions on the circumference.
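These positions on the unit circle can be checked with complex arithmetic. This is a minimal sketch using Python's standard cmath module; the helper name circle_point and the rounding to 12 decimal places (to hide floating-point noise) are my own choices for illustration.

```python
import cmath
import math

def circle_point(k):
    """Return the (x, y) co-ordinates of e^(i*pi*k) on the unit circle."""
    z = cmath.exp(1j * math.pi * k)
    # Round away floating-point noise; adding 0.0 turns -0.0 into 0.0.
    return (round(z.real, 12) + 0.0, round(z.imag, 12) + 0.0)

# Index coefficients 0, 1/2, 1, 3/2 and 2 step round the circle by quarters.
for k in (0, 0.5, 1, 1.5, 2):
    print(k, circle_point(k))
```

The coefficient one gives the half-way point (-1, 0), and the coefficients zero and two both give the starting point (1, 0).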

Interpretation of probabilistic special relativity.

The above section uses my new technique of geometric mean differentiation to derive a normal distribution of probabilities from the Interval of Special Relativity. This makes for possible probabilistic interpretations of this theory, usually supposed to be strictly deterministic. But then, the calculus is supposed to be deterministic, when (I claim) it is implicitly statistical.

From eqn. 14, the normal distribution is:

F = e^(-V) = e^(-T/I) = e^(-(vt - ct)² / [2ct(1 - t)I])

= e^(-c²(vt/c - t)² / [2ct(1 - t)I]).

Compare this with the standard form of the independent variable in the normal distribution, given as equation 18:

F = e^(-(m - np)² / (2npq)) = e^(-n²{(m/n) - p}² / (2npq)).

Here, the notation refers to the general mathematical meaning of the distribution, rather than specific meanings of special relativity. In both cases, the basic form of the equation is given by F = e^-V, where F stands for frequency.

The term, q, is a complementary probability of failure, to the probability of success, p. Usually, p + q = 1, where unity refers to the whole sum of more or less probable possibilities. When the probability, p = q = 1/2, there is an equal probability of success or failure. If tossing a coin for heads is conventionly deemed a success, and tails, a failure (say, if there is a bet for the coin to land heads), it is most likely that for half the coin throws, the coin will land heads, and the other half coin throws, the coin lands tails.

The number of times the coin is thrown is called the size of the sample (number of throws). The normal distribution is not accurate for small numbers. But for simplicity of explanation, say a coin is thrown four times, then sample size n = 4. There are a number of distinct logical possibilities for a given sample.

All the possibilities for heads, H, and tails, T (not to be confused with T for a function of time in eqn. 14), are given by the binomial expansion of: (H + T)^n = (H + T)^4. The binomial theorem gives, in terms of algebra, the form of Pascal's triangle, whose row four lays out the probable distributions according to chance: One case of four heads, one corresponding case of four tails; four cases of three heads and one tail and four corresponding cases of three tails and one head; finally, and most probably, or the norm of the normal distribution, six cases of two heads and two tails.
These cases: 1; 4; 6; 4; 1, recognisable as row four of Pascal's triangle, are the frequencies, F, which the equation of the normal distribution solves (when applied to larger samples than this, but that is the principle of it).
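The frequencies of row four can be reproduced from the binomial coefficients. A minimal sketch, using math.comb from Python's standard library (Python 3.8 or later); the variable names are only illustrative.

```python
from math import comb

n = 4  # sample size: four tosses of a coin

# Number of distinct sequences giving 0, 1, 2, 3 or 4 heads.
frequencies = [comb(n, heads) for heads in range(n + 1)]

print(frequencies)       # [1, 4, 6, 4, 1]
print(sum(frequencies))  # 16
```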

The average number of successes is the probability, p, multiplied by the sample size, n. Thus the average, np, is one half of four, which equals two. That is two heads is the normal outcome in 6 out of a total of 16 trials, given that there are sixteen trials of tossing a coin four times, and noting which way the coin falls in each sample of four coin-tossings.

The variable, m, stands for the range of possible successes from zero heads to four heads. When m = np, the subtraction m - np = 0, which would yield the value for the normal frequency (the maximum of 6 in above example).

A standard form of the distribution sets np, the average value of m, which graphs on the x-axis directly below the norm, at zero, so that more or less than the average number of successes are expressed in terms of plus or minus the average. In this example, it would be plus or minus one or two heads (above or below the average of two heads, which convention resets to zero).

Whether or not this re-setting convention is adopted, the point is that the normal distribution is an equation, which tells the frequency of any given possibility in the range, m (in this example, 0, 1, 2, 3 or 4 heads).
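How roughly the equation fits so small a sample can be seen by comparing it with Pascal's row four. The sketch below is made under an assumption not stated above: that the exponential of eqn. 18 is scaled by the usual factor, total trials divided by the square root of 2π npq, so that the frequencies add up to about the total number of trials. The function name normal_frequency is hypothetical.

```python
import math
from math import comb

def normal_frequency(m, n, p, trials):
    """Normal-curve frequency of m successes in a sample of size n.

    Assumes the standard scaling trials / sqrt(2*pi*n*p*q) in front of
    the exponential e^-(m-np)^2/2npq of eqn. 18.
    """
    q = 1.0 - p
    variance = n * p * q
    peak = trials / math.sqrt(2 * math.pi * variance)
    return peak * math.exp(-(m - n * p) ** 2 / (2 * variance))

n, p, trials = 4, 0.5, 16
for m in range(n + 1):
    # Exact binomial frequency beside the normal-curve estimate.
    print(m, comb(n, m), round(normal_frequency(m, n, p, trials), 3))
```

The estimates follow the same rise and fall as 1; 4; 6; 4; 1 without matching them exactly, as expected for a sample this small.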

Finally, note that there is no corresponding term in eqn. 18 for term I, the Interval, in eqn. 14. That means that the Interval must be set at unity for the two equations of the normal distribution to match. This is not too big a problem because the Interval is constant and so may be set at a constant number like one.

Having described how the normal distribution equation works, what does it mean in terms of special relativity, as derived above?
Table 1 shows equivalent variables.

Table 1: Normal distribution in special relativistic terms.

Statistical term | Statistical term's meaning | Special Relativity term | SR term's meaning
p | probability of success | t | time
q = 1 - p | probability of failure | q = 1 - t |
n | sample size | c | light speed
m | range of possible results from 0 to n | ix = y | distance
F | frequency for any given m | F | frequency

Table 1 is just to show the equivalence of terms from a mathematical point of view. It is unenlightening until it is used to convey a statistical meaning to special relativity.
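The equivalence of the two equations under these substitutions can be checked numerically. This is only a sketch under stated assumptions: the Interval is set at unity, the distance taking the place of m is read as vt (the term that stands where m does in eqn. 14), and the numbers for c, t and v are arbitrary unit-free values chosen for illustration, not physical measurements.

```python
import math

def f_statistical(m, n, p):
    """Eqn. 18: F = e^-(m-np)^2/2npq, with q = 1 - p."""
    q = 1.0 - p
    return math.exp(-(m - n * p) ** 2 / (2 * n * p * q))

def f_relativistic(v, t, c, interval=1.0):
    """Eqn. 14: F = e^-(vt-ct)^2/2ct(1-t)I, with 0 < t < 1."""
    return math.exp(-(v * t - c * t) ** 2 / (2 * c * t * (1 - t) * interval))

# Illustrative values only (assumed): c in place of n, t in place of p.
c, t, v = 4.0, 0.5, 3.0
m = v * t  # the distance that stands in place of m

print(f_statistical(m, n=c, p=t))
print(f_relativistic(v, t, c))
```

Both lines print the same number, which is all that the equivalence of terms amounts to at this stage.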

The normal distribution graphs as a bell-shaped curve. The central dome of the bell is the norm, where the greatest height of the bell is. This pictures the greatest frequency or the most times an event occurs, as when the most probable event is two heads and two tails, for each trial in which a coin is tossed four times, or, to put it another way, each time four coins are tossed.

The flares at each end of the bell, at its least height, represent the least likely or least frequent occurrences, when, in this case, either four heads or four tails are thrown in a single trial. Of course, small samples, like this, are a poor approximation of the bell curve. The smoothness of the bell graph represents so many discrete occurrences that the differences in frequencies, between them, look to be part of a smooth transition.

Thus, a graph of the normal distribution has, for its x-axis, the range, m, of possible occurrences from zero to four or whatever. The y-axis gives the frequency of each of the possible occurrences. The curve is a symmetrical bell shape when the probabilities of success and failure are also symmetrical or p = q = 1/2. When they are not, the curve of frequencies will be skewed either way, according to whichever is the greater, the probability of success or of failure.
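The symmetry and the skew can be seen in the exact binomial frequencies for the four-toss example. A minimal sketch; the function name binomial_frequencies and the skewed probability p = 1/4 are assumptions chosen only for illustration.

```python
from math import comb

def binomial_frequencies(n, p, trials):
    """Expected frequency of m = 0..n successes over a number of trials."""
    q = 1.0 - p
    return [trials * comb(n, m) * p ** m * q ** (n - m) for m in range(n + 1)]

symmetric = binomial_frequencies(4, 0.5, 16)   # p = q = 1/2
skewed = binomial_frequencies(4, 0.25, 16)     # p = 1/4, q = 3/4

print(symmetric)  # reads the same forwards and backwards
print(skewed)     # bunched towards few successes, since failure is likelier
```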

It makes sense that time is defined in terms of probability, which is how thermo-dynamics defines time observationly. The second law about the running down of the universe says that, on the whole, there is a tendency to disorder or entropy, which we perceive as the passage of time. Our aging bodies mark the passage of the decades. Tho, good treatment may defy the apparent effects of time.

Light speed, c, as equivalent to sample size, n, makes sense in terms of my page about a caldera cosmic model, where light speed was the norm about speeds varying from zero to twice light speed.

Suppose that observed standard light speed is the average speed of a normal distribution. One half of the distribution covers (tardyonic) speeds up to the average or up to light speed. The other half of the distribution covers faster-than-light or tachyonic speeds. The mathematical form of the normal distribution, like the mathematical forms of special relativity, allows velocity, v, to stand for any speed tachyonic or tardyonic, tho only speeds up to light speed itself are classicly observed.

An electoral analogy is in a random distribution of seats per constituency in a multi-member system. Say there are sixteen constituencies with an average of two seats per constituency and a maximum of four seats per constituency. Then the sample size, n, most simply would be four districts, in which any of the logical possibilities of seats per constituency occurs. Maximum light speed is equivalent to a maximum "velocity", so to speak, of four seats per constituency, in the system.

Assume a symmetrical distribution of seats from constituencies made up roughly equally of rural and urban districts, with no seats in one constituency that randomly occurs all of rural districts. Then, the probability, of finding rural or urban districts, is a half, and the average number of seats per constituency is the probability times the sample size, or np = 4/2 = 2 seats per constituency. Out of 16 constituencies, the binomial distribution or Pascal's triangle gives 6 constituencies with this average number of seats.

Thus the observed speed of light is equivalent, in this electoral analogy, with the average seats per constituency.

Deriving a normal distribution from the Interval of special relativity almost forces quantum physical considerations of varying light speed.

And note that the basic equation of quantum theory, the Schrodinger equation, is of the mathematical form of a partial differential equation called a diffusion equation or heat conduction equation. (This is despite the confusion from referring to the Schrodinger wave equation, when it is not of the form of partial differential equation called the wave equation.)

Comparing the diffusion equation to its finite difference equation version, think of Pascal's triangle. The rows are generated one by one, from 1 to 11, to 121, to 1331, etc. If you think of that generation happening over time, one row at a time, then the expanding rows represent an algorithm for distributions in diffusion.
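The row-by-row generation can be written as a simple update rule, in which each entry is the sum of its two neighbours in the previous row. This is a minimal sketch of that rule only, not a full finite difference solver for the diffusion equation; dividing each row by its total would turn the counts into probabilities. The function name next_row is my own.

```python
def next_row(row):
    """One step of the triangle: add each pair of neighbouring entries."""
    padded = [0] + row + [0]  # nothing lies beyond either edge
    return [padded[i] + padded[i + 1] for i in range(len(padded) - 1)]

rows = [[1]]
for _ in range(4):  # four steps in "time"
    rows.append(next_row(rows[-1]))

for row in rows:
    print(row)
```

Each step spreads the starting count one place further out on either side, while the middle stays highest.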

The diffusion equation is of the diffusion or spreading over time to a distribution in space. The equation solves for the probability of finding, say, some particle in a given region according to random probability theory.
Quantum probability has certain revolutionary divergences from classical probability.
Also, the Schrodinger equation is non-relativistic. Nevertheless, having framed relativity in terms of the same basic equation form, the diffusion equation, it is tempting to believe that some sort of reconciliation between quantum theory and relativity is possible thru a new geometric mean differentiation of special into general relativity that solves as a potentially diffusing random distribution.

On a sequel page (Random choice models for a GM derivative of a Thermo-dynamic Relativity) I couch my geometric mean derivative, of general from special relativity, in terms of a thermo-dynamic probabilistic definition of time.
The suggestion is that all three branches of physics, relativity, thermo-dynamics and quantum theory might be related thru this geometric mean derivative.

How might the normal curve, considered as an acceleration of frequencies, bridge the gap from special to general relativity?

Perhaps, equation 14 can be considered as the Interval in acceleration form, where accelerating motion is the province of the theory of general relativity.

The Interval in special relativity measures a given (velocity) event in the same space-time terms for all observers. Tho, observers have their own local space and time measures. The Interval is still contained in a statistical derivative of its quadratic form, which gives a normal distribution in the dimensions of acceleration. Therefore, different local observations of space and time can be accommodated not only to a commonly observed velocity event but now also a commonly observed acceleration event.

Crudely speaking (and that is all I am able to do for general relativity) Einstein's principle of equivalence is of two frames of reference, one at rest and the other accelerating. This is the famous mental experiment of observers inside and outside of an accelerating lift.

To express a rest frame in equation 14, velocity is set to zero, or v = 0. Essentially, the way the normal distribution is calculated, in this example, the velocity, v, is subtracted from the light speed, c. (It doesn't matter that the subtraction is negative because the negative sign is removed when the factor is squared.) To cut a long story short, when all the other variables in the independent variable have been accounted for, the value for the dependent variable, F, is reached.

On a graph, the frequency, F, is the vertical height of the bell-shaped normal curve at any point along its length. The velocity, v, is measured along the horizontal axis or x-axis, with light velocity, c, in the middle, directly below the norm or apex or mode of the normal curve. The norm measures the highest frequency value of variable F.

Rest or zero velocity measures the frequency at its lowest point. In theory, the normal curve is an exponential function, so that the flares of the bell curve continue indefinitely. The slope of the flares gets ever more gradual but never quite flattens out nor ever quite reaches the x-axis.

In effect, the frequency for zero velocity is on a stretch of curve that is indistinguishable from a straight line, whose slope is also too shallow to measure for a velocity greater than zero. That is the graphical representation of rest. Likewise, the frequency, which the statistical derivative identifies as having acceleration dimensions, measured on the vertical axis or y-axis, is too low to be distinguished from zero. Hence, acceleration is practicly zero.

Thus, the normal curve pictures an event that has different acceleration frequencies, from practicly rest or zero velocity frame, to the highest frequency for light speed reference frame.

The Interval of special relativity can represent an event as occurring at a point in four-dimensional space-time. But the Interval, statisticly differentiated into terms of a normal curve, represents an event as that normal curve, with different observers' measures making-up possible points, in terms of acceleration frequencies, all along that curve.
This seems to suggest that the whole curve represents a (general relativistic type) acceleration event seen from a range of possible observations.

Richard Lung.
9 july 2010.
Corrections: 20 july; 24 august; 8 & 23 september; 19 & 22 october 2010. 24 february 2011.
My apologies for the running corrections.
Further comments: 20 april 2011.
