Understanding Algebra: Why do we factor equations?

kalid · on Aug 31, 2012

Hi all! Author here. Didn't expect to see this on HN so early this morning :).

Lots of good feedback, I'll try to answer some of it collectively here.

HN readers who can spit out the quadratic formula in their sleep aren't the real audience for this article :). It's someone googling "Why are we factoring all the time?".

A high-level question needs a high-level answer before diving in. I'm not a fan of "Because that's where the graph touches the x-axis" because it again leads to... Why does touching the x-axis matter?

My intuition is that we have a system and we have something we want it to become. The trick is to track the difference as its own system, write it as interlocking parts, and break any of the parts.

"How do we break a chain?" => we break any of the links. Intuitively, that's what we're enabling when we factor an equation into a series of multiplications. I'm probably going to continue to reword the explanation, but that is fundamentally why we bother rewriting as a series of multiplications.

On rigor and simplicity: guilty as charged. I believe you need to become intuitively comfortable with an idea, even if slightly incorrect, before getting into the nuance.

We tell kids a cat is a furry animal that has claws and a tail. Later, we refine to say they're all descendants of a common ancestor. Later we say there's this thing called DNA which holds genetic information, and all cats have DNA in common.

See this article for more:

http://betterexplained.com/articles/developing-your-intuitio...

I go into the 4 rigorous and common definitions of e, something which confused me (and many, many engineering students) for years. But approaching with intuition it all snaps together.

Again, that's my teaching style though! It's been very successful for me and other students. Rigor is always available on Wikipedia and Mathworld if you need it.

klochner · on Aug 31, 2012

I applaud your effort to describe why we factor equations, I just don't think you've done that here.

   A high-level question needs a high-level answer before
   diving in. I'm not a fan of "Because that's where the
   graph touches the x-axis" because it again leads to...
   Why does touching the x-axis matter?

Because that's where the system "becomes what you want it to become", i.e., where:

  x^2 + x - 6 = 0 
      x^2 + x = 6

You're trying to describe factoring to someone without the basic fundamentals to even need to do factoring.

By analogy, that's like saying "I want to explain why we need piston rods, but I don't want to mention anything about engines."

Prerequisites are meaningful.

[Edit] I just verified that graphing is typically taught after factoring, so I at least sympathize with the challenge you're facing.

kalid · on Aug 31, 2012

Thanks for the feedback! Yep, graphing usually comes much later, so I'm trying to find an explanation that works without it.

For the piston rods, I'd say something like "They help capture the power of an explosion. And an engine is a really, really cool device to make this power useful. Want to see how it works?" :)

csense · on Aug 31, 2012

The way I'd explain this is the reverse of the way you're going about it. You're starting with a complicated equation and simplifying it.

I'd start with the simplified equations and make them gradually more complicated, showing carefully at each stage how to get back to the simpler case.

0. We can solve linear equations.

1. If you have a factored quadratic polynomial equal to zero, you can find its solutions by solving F_i = 0 for each factor F_i (which we can do, by 0, since F_i is linear). A solution to either of those linear equations is a solution to the original equation, and by the zero product property there are no others.

2. If you have a quadratic expression equal to zero, you can find the solutions by factoring, then applying (1).

3. If you have an equation L = R with quadratic expressions on both sides, you can find its solution by subtracting R from both sides to get L-R = 0, then applying (2).

Both rigorous and simple, and also introduces one of the standard design patterns of higher mathematics: Find a solution to a simpler problem, then see what kind of more-complicated problems you can solve by reduction to the simpler case.
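
The reduction in steps 0-3 can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, quadratics are written as (a, b, c) coefficient tuples, and the factoring step only handles a monic difference that factors over the integers.

```python
from fractions import Fraction

def solve_linear(m, k):
    """Step 0: solve m*x + k = 0 (m must be nonzero)."""
    return Fraction(-k, m)

def solve_factored(factors):
    """Step 1: each linear factor (m, k), meaning m*x + k, gives one solution."""
    return sorted({solve_linear(m, k) for m, k in factors})

def factor_monic_quadratic(b, c):
    """Step 2: find integers p, q with x^2 + b*x + c = (x + p)(x + q)."""
    for p in range(-abs(c) - 1, abs(c) + 2):
        q = b - p  # p + q must equal b
        if p * q == c:
            return [(1, p), (1, q)]
    return None  # no integer factorization exists

def solve_quadratic_equation(left, right):
    """Step 3: rewrite L = R as L - R = 0, then apply steps 2 and 1."""
    a, b, c = (l - r for l, r in zip(left, right))
    if a != 1:
        raise ValueError("this sketch only handles a monic difference")
    factors = factor_monic_quadratic(b, c)
    if factors is None:
        return None
    return solve_factored(factors)

# x^2 + x = 6 becomes x^2 + x - 6 = 0, which factors as (x - 2)(x + 3)
solutions = solve_quadratic_equation((1, 1, 0), (0, 0, 6))
```

Each function only calls the simpler one below it, which is the reduction pattern described above.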

batista · on Aug 31, 2012

I'm a fan, I've even bought the "Math, better explained" ebook in the past, but this article, while nice, wasn't on par with the material there.

Not to be scrapped though, just needs some rewording and some more meat.

kalid · on Aug 31, 2012

Thanks! Yep, this was one of the faster ones I've written. I'm doing a few experiments to see how to increase my article output :).

psykotic · on Aug 31, 2012

The bigger picture is that factorization is decomposition. Depending on the domain, that can mean different things.

In geometry, factoring a polynomial in x and y decomposes its curve into a union of subcurves. The curve of xy = 0 is the union of the curves of x = 0 (vertical line through origin) and y = 0 (horizontal line through origin), so the curve of xy = 0 is a pair of crossing lines with a double point at the origin. Similarly, a complete factorization of f(x) decomposes the intersection of y = f(x) and y = 0 into a union of points (the linear factors). But what about irreducible quadratics like x^2 + 1? From a geometrical perspective, they're an algebraic nuisance that goes away with complex numbers.

In linear time-invariant systems, a factorization of a transfer function corresponds to a serial decomposition of the system because of the convolution theorem. The most interesting case is when the system is recursive, so that the transfer function is a general rational function (a ratio of polynomials) and you have feedback loops in the block diagram. A parallel decomposition corresponds to a partial fraction expansion, which is based on division and factorization. These decompositions can also be mixed, so you can take a transfer function that is a ratio of sextics, factor them into a pair of cubic ratios (serial decomposition) and then break each of those halves into partial fractions (parallel decomposition).
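
A small Python sketch of the serial decomposition, restricted to the non-recursive (FIR) case for simplicity. The filter coefficients and function names are made up for illustration. The transfer function 1 - 5z^-1 + 6z^-2 factors as (1 - 2z^-1)(1 - 3z^-1), so running the two first-order stages in series should match the single second-order filter.

```python
def convolve(a, b):
    """Discrete convolution, which is the same as polynomial multiplication."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def fir_filter(h, signal):
    """Apply a causal FIR filter with impulse response h.

    The output is truncated to the length of the input signal.
    """
    return convolve(h, signal)[:len(signal)]

stage1 = [1, -2]                    # 1 - 2 z^-1
stage2 = [1, -3]                    # 1 - 3 z^-1
combined = convolve(stage1, stage2) # product of the two factors

signal = [1, 2, 3, 4]
direct = fir_filter(combined, signal)
cascade = fir_filter(stage2, fir_filter(stage1, signal))
```

Multiplying the factors is a convolution of their coefficients, which is the convolution theorem at work on a toy scale.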

In calculus, the partial fraction expansion lets you integrate all rational functions once you know how to integrate functions of the form x^n for any integer n, positive or negative.
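
As a worked example (the particular function is chosen here for illustration), factoring the denominator x^2 + x - 6 = (x - 2)(x + 3) splits the integrand into two pieces of the form 1/(x - r), each of which integrates to a logarithm:

```latex
\[
\frac{1}{x^2 + x - 6}
  = \frac{1}{(x-2)(x+3)}
  = \frac{1}{5}\left(\frac{1}{x-2} - \frac{1}{x+3}\right)
\]
\[
\int \frac{dx}{x^2 + x - 6}
  = \frac{1}{5}\Bigl(\ln|x-2| - \ln|x+3|\Bigr) + C
  = \frac{1}{5}\ln\left|\frac{x-2}{x+3}\right| + C
\]
```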

I could give more examples, in probability theory (characteristic functions and moment-generating functions), in combinatorics (generating functions), in the theory of linear differential equations (the Laplace transform diagonalizes shift-invariant operators like differentiation, so differential equations become algebraic equations), etc.

grandalf · on Aug 31, 2012

Are there any interesting/useful engineering applications of the recursive ones you mention?

psykotic · on Aug 31, 2012

Tons. The whole theory of IIR filter design is predicated on understanding the effects of the transfer function's poles and zeroes (generally complex numbers) on the behavior of the filter. Pick up any book on signals and systems like Oppenheim to get the details. Julius Smith's online books are also very good, albeit terse.

grandalf · on Aug 31, 2012

Thanks! That is actually an area of interest. Thanks for the recs.

psykotic · on Aug 31, 2012

Here's the relevant chapter in one of Julius Smith's books:

https://ccrma.stanford.edu/~jos/filters/Transfer_Function_An...

It might not make total sense if you don't have any background in this area, but it's hopefully enough to give you an inkling.

grandalf · on Aug 31, 2012

Thanks! I am actually learning the background theory at present so will bookmark the link.

yequalsx · on Aug 31, 2012

I've seen a number of criticisms of the article. Some things need to be kept in mind.

Presently it is standard to introduce factoring before one knows how to graph polynomial functions. It is possible to argue that we should teach graphing before factoring, and I think a compelling case can be made for this, but it isn't standard practice. Anyone who thinks of using x-intercepts to explain this probably has had some College Algebra. The topic in the article is usually covered in Beginning Algebra.

References to the discriminant in criticizing the article are not appropriate. The discriminant is taught after teaching factoring.

The normal order of topics is

1. factoring

2. solving quadratic equations with factoring

3. proving the quadratic formula

4. solving general quadratic equations

czzarr · on Aug 31, 2012

Are you serious about college algebra? In France this is taught in middle school.

sp332 · on Aug 31, 2012

To be clear: yequalsx said "Anyone who thinks of using x-intercepts to explain this probably has had some College Algebra. The topic in the article is usually covered in Beginning Algebra." So x-intercepts are taught in middle school, but the idea of using them to explain this particular problem probably wouldn't occur to someone with so little experience.

yequalsx · on Aug 31, 2012

Indeed, college algebra students typically struggle when trying to tie in the concepts of factoring, x-intercepts, remainder theorem, and zeros of a polynomial. Lots of gnashing of teeth in this chapter.

csense · on Aug 31, 2012

"College algebra" is usually taken by people who haven't had algebra, geometry and trigonometry classes in middle or high school. If you had all those classes, many colleges allow you to enroll in calculus as a freshman (sometimes you have to take a placement test), or you can even skip some of the calculus sequence with AP credit.

College algebra isn't something that a lot of math, science and engineering majors take, especially if they're toward the top of the class and haven't taken the easy road in their high school course selections.

louischatriot · on Aug 31, 2012

I would say that the discriminant method is taught at the same time as factoring, when we prove why this "b^2-4ac" is so interesting.

yequalsx · on Aug 31, 2012

You may say this but it is wrong to believe this. The discriminant is not taught when factoring is taught. Indeed, any quadratic can be factored over the complex numbers but this is beyond the students' mathematical prowess at the time we teach factoring. It's why we say to beginning algebra students that x^2 + 1 is prime but we show college algebra students that it can be factored over the complex numbers.

louischatriot · on Aug 31, 2012

To clarify my comment above, I was talking about factoring second-degree polynomials, which is taught at the same time as the discriminant in France (I don't know how it is done in other countries). This makes sense as they are basically the same thing.

What do you mean by "x^2+1 is prime"? For me, prime only applies to integers. Or am I missing another way this word is used?

yequalsx · on Aug 31, 2012

Prime applies to objects in any commutative algebraic system that can't be further decomposed. I don't want to get too technical so I'm speaking in very loose terms. x^2+1 is prime over the integers but not over the complex numbers.

I don't know the French education system but it would be very shocking to me if the discriminant is taught when factoring is first taught. Factoring x^3 - y^3 does not have anything to do with the discriminant and x^3 - y^3 is typically taught when factoring is first introduced. At least in the U.S.

R_Edward · on Aug 31, 2012

x^2 + 1 is prime over the integers? Maybe they've changed math since I was in high school, but I'm pretty sure that, given n, such that n is an integer, if x = (2n + 1), then x^2 + 1 = 4n^2 + 4n + 2, which equals 2(2n^2 + 2n + 1), which is clearly not prime.

csense · on Aug 31, 2012

When speaking of factoring polynomials, any constant factor is considered a unit, which doesn't count toward whether a polynomial is prime (irreducible) or not.

In the integers, 1 and -1 are units. So 7 is prime even though it can be "factored" as 7 = 7(-1)(-1) or 7 = (7)(1). Likewise, 4n^2 + 4n + 2 is prime, even though it can be "factored" as (2)(2n^2 + 2n + 1).

Also, your method of substituting a variable is suspect. Being irreducible is a property of a particular polynomial; if you substitute a different expression for x and simplify, you have a different polynomial. Your argument is like saying, "7 is prime, but if you add three to it, it becomes 10 which isn't prime, how can this be?" -- you did something to the number 7 which turned it into a different number, so it should be no surprise that its properties may have changed in the process.

If you want to know about polynomial factorization over various fields, any good abstract algebra textbook should discuss the topic at great length.

R_Edward · on Sept 12, 2012

Unlikely you'll see this, so long after the fact, but oh well. My "method of substituting a variable" is nothing more than a more mathematically rigorous way of observing that x^2 + 1 is not prime for any of the odd integers, which is roughly half of them. Of course, that's using a definition of "prime" that is at odds with yours, and I'll defer to your more advanced math knowledge in that regard.

lacker · on Aug 31, 2012

This seems like a confusing explanation to me.

We factor equations because that makes it easy to figure out what x is. If you did not know any algebra, and looked at an equation like

Ax^2 = Bx + C

it is not obvious what x can be. But if you factor it into

(x - D)(x - E) = 0

then if you know how zero multiplies, it is fairly clear that x must be D or E.

jpeterson · on Aug 31, 2012

Your explanation doesn't address where the zero came from in the second step. In other words, you're presupposing that we already know what "factoring" is, which is what this article is meant to explain.

fusiongyro · on Aug 31, 2012

If adding teepees adds clarity I certainly have no business teaching anybody any math.

wging · on Aug 31, 2012

>There’s formulas for more complex systems (with x^3, x^4, or x^5 components) but they start to get a bit crazy.

This is just untrue! It can be proven that there's no general formula that can work for x^5 without just defining away the problem by introducing a solution to a polynomial not solvable by radicals.

wcarey · on Aug 31, 2012

Niels Abel of abelian group fame (http://en.wikipedia.org/wiki/Niels_Henrik_Abel) proved this for x^5.

kalid · on Aug 31, 2012

True -- I got overzealous with x^5, some (not all) can be factored. I'll reword to "some more complex systems".

psykotic · on Aug 31, 2012

> I got overzealous with x^5, some (not all) can be factored

All polynomials can be factored but only some of the fifth or higher degree are solvable by radicals--a crucial distinction.

Here's a fun puzzle that also serves as a gentle introduction to the central idea of Galois theory: Show that any palindromic quintic is solvable by radicals. A palindromic quintic is one of the form ax^5 + bx^4 + cx^3 + cx^2 + bx + a. It might be useful to know that if you have a polynomial equation f(z) = 0 and make the substitution z -> 1/z and multiply through by z^n, you get the reversed polynomial. On the Riemann sphere, this inversion symmetry reflects the northern hemisphere onto the southern hemisphere and vice versa, interchanging the north and south pole, which are 0 and infinity in z coordinates. Therefore a palindromic polynomial is one that "looks the same" from either vantage point.

This idea of looking at the behavior of a polynomial simultaneously from the vantage point of 0 and infinity is also the basis of how the fundamental theorem of algebra is proved. The general model is 1 + z^n. Once you understand that polynomial qualitatively near 0 and infinity, you can understand every other polynomial by a simple perturbation analysis.

lotharbot · on Aug 31, 2012

When solving high-school algebra problems, we try to reduce them to arithmetic. When solving calculus problems, we try to reduce them to algebra. When solving differential equations, we try to reduce them to integrals. In essence, we use some new technique in order to take a problem we don't know how to solve and transform it into a problem we do know how to solve.

That's why we factor: because we can (sometimes) transform a higher-order equation into a series of linear pieces, and solving a linear equation is essentially arithmetic.

wcarey · on Aug 31, 2012

Who's the target audience for that explanation? It seems like the initial question can be solved much more easily by rewriting x^2 + x = 6 as (x)(x+1) = 6. Seeing that x can be two seems intuitive, and that x can be -3 a bit less so.

I wonder whether talking about algebra as the manipulation of statements to construct consistent sets is more productive than talking about error in systems?

Instead of x hiding a value, we want to know what statements can we make about x that are consistent with x^2 + x = 6.

pavel_lishin · on Aug 31, 2012

> seems intuitive

You can't really write "well, it seems intuitive" into a logical proof, or into an algorithm.

alanctgardner · on Aug 31, 2012

But the point of this site is seemingly "understanding", not an algorithm - see the title of the page ("right, not rote"). However, what they really seem to be doing is breaking down what should be an intuitive process - understanding equalities and how to work with them - into several smaller, vague sub-processes which also aren't really easy to understand. I would argue that explanations like this might hurt someone's chances of understanding concepts like systems of equations later on.

pavel_lishin · on Aug 31, 2012

I thought it was meant to give us an intuitive explanation for why we approach things via the "factoring algorithm" as a method of solving these guys.

In any case, it's just a simple example case. x^2 + x = 6 can probably be solved just by staring at it for a little bit, but once the equation gets more complex, trying out a couple of numbers in the [-5,5] range is not gonna work anymore.
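
A quick sketch of the "try a few numbers" approach and where it stops working. The function name and the default [-5, 5] search range are illustrative choices; x^2 + x = 1 is used as the contrasting case because it has no integer solutions at all.

```python
def integer_solutions(f, target, low=-5, high=5):
    """Return every integer x in [low, high] with f(x) == target."""
    return [x for x in range(low, high + 1) if f(x) == target]

def system(x):
    return x**2 + x

found_for_6 = integer_solutions(system, 6)  # inspection succeeds here
found_for_1 = integer_solutions(system, 1)  # nothing to find among integers
```

Guessing finds both answers for the first equation and none for the second, whose solutions are irrational, so a systematic method is needed there.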

ChuckMcM · on Aug 31, 2012

Good point, you write it "by inspection" :-)

louischatriot · on Aug 31, 2012

Your factoring, although correct, is not really useful in the general case. For example with x^2+x=1.

wcarey · on Aug 31, 2012

To be sure. But it is useful in this particular case, so why wouldn't we use it? Similarly, you can write every multiplication of natural numbers as a summation, but we don't because multiplication notation adequately handles the case where the term of summation doesn't change and is simpler to reason about. The motivation for summation is that sometimes we want to add up things where multiplication notation fails. Similarly, the motivation for factoring is that sometimes we run into an equation that we can't intuitively reason about. His explanation would be more persuasive to me with an example where simple inspection fails.

If the target audience for the article is people who understand factoring and are amused by thinking about it in different ways (the bit about systemic error was new to me and intriguing!), then I think it succeeds. If the target audience for the article is people who don't yet understand how to factor, but do understand why to factor, then I think it sort-of succeeds. There are probably people (the author is certainly one?) for whom his explanation will seem intuitive and compelling. If the target audience is people who know neither how to factor nor why, then I think it's more problematic.

It's symptomatic of a rush to abstract and generalize that often leaves students wondering why the abstractions are more valuable than more intuitive reasoning. An example that requires factoring would speak to that motivation a bit better, I suspect.

alanctgardner · on Aug 31, 2012

Looking at some other explanations from this site, it seems like it suffers from the same things people complained about with Khan: a lack of rigour, and oversimplification. They have some good visuals and metaphors, but ultimately the 'plain' explanations of mathematical properties are a poor foundation for someone to build upon when they get to higher concepts.

bsaul · on Aug 31, 2012

I don't understand how adding "error" (as in "factoring the error") helps anyone understand why we factor. I would have started with a factored form with 0 on the right, noting that a*b=0 is easy to solve because of the way 0 and multiplication work together.

maxerickson · on Aug 31, 2012

(Apparently), people don't always understand that ax^2+bx=c is equivalent to ax^2+bx-c=0.

So they don't see how factoring the latter reveals anything interesting about the former.

kalid · on Aug 31, 2012

Exactly! In the real world you are trying to get some system (x^2 + x) into some desired state (2).

In math class, we usually "pre-optimize" these equations to (x^2 + x - 2 = 0) and they look very different from the situations you have to set up on your own. I want to make the connection explicit.

czzarr · on Aug 31, 2012

thank god my math teacher didn't teach like that. This is the most confusing explanation of factoring I have ever seen.

for the apparently numerous people here who have no idea how to solve this equation generically, read this: http://en.wikipedia.org/wiki/Discriminant or watch this: http://www.khanacademy.org/math/algebra/quadtratics/v/discri...

louischatriot · on Aug 31, 2012

I find this explanation overly complicated. For the general case a*x^2+bx+c, the easiest way is to try to reduce it to the form A^2-B^2 which has the easy factorisation (A+B)(A-B). So:

ax^2+bx+c = a[x^2+(b/a)x+(c/a)]
= a[x^2+(b/a)x+(b/2a)^2-(b/2a)^2+c/a] (only adding and removing the same thing)
= a[(x+(b/2a))^2 - ((b^2/(4a^2))-c/a)] (using (A+B)^2=A^2+B^2+2AB)

Call D the term (b^2/(4a^2))-c/a. If D is positive we can take its square root, factor as a(x+(b/2a)+sqrt(D))(x+(b/2a)-sqrt(D)), and have our two solutions.

I like more analytical solutions.
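
A minimal Python sketch of the completing-the-square approach, assuming real coefficients and taking D = (b/(2a))^2 - c/a, the constant left over after completing the square. The function name is hypothetical, and the D = 0 and D < 0 cases are handled in the simplest way (one root and no real roots respectively).

```python
import math

def solve_by_completing_square(a, b, c):
    """Solve a*x^2 + b*x + c = 0 by rewriting it as a[(x + h)^2 - D] = 0."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    h = b / (2 * a)    # half of the x coefficient after dividing by a
    D = h * h - c / a  # what remains after completing the square
    if D < 0:
        return []      # no real square root, so no real solutions
    root = math.sqrt(D)
    # (x + h)^2 - D = (x + h + root)(x + h - root), a difference of squares
    return sorted({-h - root, -h + root})

roots = solve_by_completing_square(1, 1, -6)
```

For x^2 + x - 6 this gives h = 0.5 and D = 6.25, so the factors are (x + 3)(x - 2).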

bazzargh · on Aug 31, 2012

Wow, that's a terrible explanation. Try drawing a picture instead. The factorization simply answers the question of where the graph touches the x-axis.

tomrod · on Aug 31, 2012

I think the explanation is thorough. Is it intended to be a full and complete explanation, or an introduction? As an introduction it'd be very poor--it would probably be too encyclopedic as an introduction.

eckyptang · on Aug 31, 2012

Another thing to add to this: Factored equations can be computationally cheaper to execute.

psykotic · on Aug 31, 2012

Not if you use Horner's method. Evaluating a(x-p)(x-q) takes two additions and two multiplications. Evaluating ax^2 + bx + c with Horner's method as (ax + b)x + c takes two additions and two multiplications as well. Factoring a polynomial to save on evaluation is as inefficient and roundabout as factoring a pair of integers to help compute their greatest common divisor. You're replacing an easy problem with a much harder one.
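
A short Python sketch of the two evaluation strategies, with illustrative function names and coefficients. For a quadratic, both cost two multiplications and two additions or subtractions, but only the Horner form works directly from the coefficients without finding the roots first.

```python
def eval_factored(a, p, q, x):
    """Evaluate a(x - p)(x - q): two subtractions, two multiplications."""
    return a * (x - p) * (x - q)

def eval_horner(coeffs, x):
    """Evaluate a polynomial given coefficients from highest degree down.

    For [a, b, c] this computes (a*x + b)*x + c: one multiplication
    and one addition per coefficient after the first.
    """
    result = coeffs[0]
    for coef in coeffs[1:]:
        result = result * x + coef
    return result

# 2(x - 2)(x + 3) expands to 2x^2 + 2x - 12
values_factored = [eval_factored(2, 2, -3, x) for x in range(-5, 6)]
values_horner = [eval_horner([2, 2, -12], x) for x in range(-5, 6)]
```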

eckyptang · on Aug 31, 2012

That's a fair point. Perhaps my example was badly chosen!

ten_fingers · on Aug 31, 2012

The answer is simple, dirt simple: If you have to ask the question, then DON'T!

That is, we factor algebraic expressions if and only if (iff) we have a good reason to do so. If we don't have a good reason to factor, then there is no need to bother.

Yes, in high school, 'factoring' is seen as an important algebraic manipulation. It is. Then high school continues on and wants to factor whenever possible and for no reason other than it is possible. This is dumb.

Also, commonly there is more than one way to factor. Then high school gets all in a tizzy over which way is 'best'. Nonsense. Again, we factor for a reason we have in mind, and of several possible ways to factor we select the one for the reason we have in mind. Simple.

We factor when we have a reason to do so. Otherwise, f'get about it! High school teachers: Understand that now?

My authority: I hold a Ph.D. in the applied math of stochastic optimal control. I've taught math in college and graduate school. I've published peer-reviewed original research in applied math and mathematical statistics.

majelix · on Aug 31, 2012

The obvious counterpoint here is that we're trying to teach skills before they're needed. After all, it would suck to have to rediscover Calculus on your own in the middle of your Physics II exam just because you didn't see a point to it at the time.

ten_fingers · on Aug 31, 2012

Yes, and that's the way I learned in high school. At the time it appeared that we factored to achieve some 'artistic' goals of making the algebraic expressions 'look nice'. When later I concluded that we factored for some serious purposes and that the artistic goals of looking nice were silly, I resented some of what I had been taught.

But the question on this thread is appropriate: "Why" do we factor? Sure, the reason in some of high school is just to learn how to factor so that we will be able to when we need to, say, working with integration by parts in calculus. But likely this thread and the students want a reason more substantial than just to learn for later. So, my answer was (say, beyond just learning) to factor when there was a good reason and otherwise just f'get about it, and basically that's the correct answer.

batista · on Aug 31, 2012

>My authority: I hold a Ph.D. in the applied math of stochastic optimal control. I've taught math in college and graduate school. I've published peer-reviewed original research in applied math and mathematical statistics.

With all those one would expect better reasoning and/or better wording.

ten_fingers · on Aug 31, 2012

No, my reasoning and wording are fine: Instead, "Things should be as simple as possible and not simpler"!