Generating random math expression

I have this idea running around in my head, to generate and evaluate random mathematical expressions. So, I decided to give it a shot and elaborate an algorithm, before coding it to test it.

Example:

Here are some example expressions I want to generate randomly:

4 + 2                           [easy]
3 * 6 - 7 + 2                   [medium]
6 * 2 + (5 - 3) * 3 - 8         [hard]
(3 + 4) + 7 * 2 - 1 - 9         [hard]
5 - 2 + 4 * (8 - (5 + 1)) + 9   [harder]
(8 - 1 + 3) * 6 - ((3 + 7) * 2) [harder]

The easy and medium ones are pretty straight-forward. Random ints separated by random operators, nothing crazy here. But I’m having some trouble getting started with something that could create one of the hard and harder examples. I’m not even sure a single algorithm could give me the last two.

What I am considering:

I can’t say I tried those ideas, because I didn’t really want to waste much time going in a direction that had no chance of working in the first place. But still, I thought of a couple solutions:

  • Using trees
  • Using regular expressions
  • Using a crazy “for-type” loop (surely the worst)

What I’m looking for:

I’d like to know which way you believe is the best to go, between the solutions I considered, and your own ideas.

If you see a good way to start, I’d appreciate a lead in the right direction, e.g. with the beginning of the algorithm, or a general structure of it.

Also note that I will have to evaluate those expressions. This can be done either after the expression is generated, or during its creation. If you take that in consideration in your answer, that’s great.

I’m not looking for anything language-related, but for the record, I’m thinking of implementing it in Objective-C, as that’s the language I’m most working with recently.

Those examples did not include the : operator, as I only want to manipulate ints, and this operator adds many verifications. If your answer gives a solution handling this one, that’s great.

If my question needs any clarification, please ask in the comments. Thanks for your help.

3

Here’s a theoretic interpretation of your problem.

You are looking to randomly generate words (algebraic expression) from a given language (the infinite set of all syntactically correct algebraic expressions). Here’s a formal description of a simplified algebraic grammar supporting only addition and multiplication:

E -> I 
E -> (E '+' E)
E -> (E '*' E)

Here, E is an expression (i.e., a word of your language) and I is a terminal symbol (i.e., it’s not expanded any further) representing an integer. The above definition for E has three production rules. Based on this definition, we can randomly build a valid arithmetic as follows:

  1. Start with E as the single symbol of the output word.
  2. Choose uniformly at random one of the non-terminal symbols.
  3. Choose uniformly at random one of the production rules for that symbol, and apply it.
  4. Repeat steps 2 – 4 until only terminal symbols are left.
  5. Replace all terminal symbols I by random integers.

Here’s an example of the application of this algorithms:

E
(E + E)
(E + (E * E))
(E + (I * E))
((E + E) + (I * E))
((I + E) + (I * E))
((I + E) + (I * I))
((I + (E * E)) + (I * I))
((I + (E * I)) + (I * I))
((I + (I * I)) + (I * I))
((2 + (5 * 1)) + (7 * 4))

I assume you would choose to represent an expression with an interface Expression which is implemented by classes IntExpression, AddExpression and MultiplyExpression. The latter two then would have a leftExpression and rightExpression. All Expression subclasses are required to implement an evaluate method, which works recursively on the tree structure defined by these objects and effectively implements the composite pattern.

Note that for the above grammar and algorithm, the probability of expanding an expression E into a terminal symbol I is only p = 1/3, while the probability to expand an expression into two further expressions is 1-p = 2/3. Therefore, the expected number of integers in a formula produced by the above algorithm is actually infinite. The expected length of an expression is subject to the recurrence relation

l(0) = 1
l(n) = p * l(n-1) + (1-p) * (l(n-1) + 1)
     = l(n-1) + (1-p)

where l(n) denotes the expected length of the arithmetic expression after n applications of production rules. I therefore suggest that you assign a rather high probability p to the rule E -> I such that you end up with a fairly small expression with high probability.

EDIT: If you’re worried that the above grammar produces too many parenthesis, look at Sebastian Negraszus’ answer, whose grammar avoids this problem very elegantly.

11

first of all I’d actually generate the expression in postfix notation, you can easily convert to infix or evaluate after generating your random expression, but doing it in postfix means you don’t need to worry about parenthesis or precedence.

I’d also keep a running total of the number of terms available to the next operator in your expression (assuming you want to avoid generating expressions that are malformed) i.e. somthing like this:

string postfixExpression =""
int termsCount = 0;
while(weWantMoreTerms)
{
    if (termsCount>= 2)
    {
         var next = RandomNumberOrOperator();
         postfixExpression.Append(next);
         if(IsNumber(next)) { termsCount++;}
         else { termsCount--;}
    }
    else
    {
       postfixExpression.Append(RandomNumber);
       termsCount++;
     }
}

obviously this is pseudo code so is not tested/may contain mistakes and you probably wouldn’t use a string but a stack of some discriminated union like type

9

blubb’s answer is a good start, but his formal grammar creates too many parantheses.

Here’s my take on it:

E -> I
E -> M '*' M
E -> E '+' E
M -> I
M -> M '*' M
M -> '(' E '+' E ')'

E is an expression, I an integer and M is an expression that is an argument for a multiplication operation.

2

The parentheses in the “hard” expression represent order of evaluation. Rather than trying to generate the displayed form directly, just come up with a list of operators in random order, and derive the display form of the expression from that.

Numbers: 1 3 3 9 7 2

Operators: + * / + *

Result: ((1 + 3) * 3 / 9 + 7) * 2

Deriving the display form is a relatively simple recursive algorithm.

Update: here is an algorithm in Perl to generate the display form. Because + and * are distributive, it randomizes the order of the terms for those operators. That helps keep the parentheses from all building up on one side.

use warnings;
use strict;

sub build_expression
{
    my ($num,$op) = @_;

    #Start with the final term.
    my $last_num = pop @$num; 
    my $last_op = pop @$op;

    #Base case: return the number if there is just a number 
    return $last_num unless defined $last_op;

    #Recursively call for the expression minus the final term.
    my $rest = build_expression($num,$op); 

    #Add parentheses if there is a bare + or - and this term is * or /
    $rest = "($rest)" if ($rest =~ /[+-][^)]+$|^[^)]+[+-]/ and $last_op !~ /[+-]/);

    #Return the two components in a random order for + or *.
    return $last_op =~ m|[-/]| || rand(2) >= 1 ? 
        "$rest $last_op $last_num" : "$last_num $last_op $rest";        
}

my @numbers   = qw/1 3 4 3 9 7 2 1 10/;
my @operators = qw|+ + * / + * * +|;

print build_expression([@numbers],[@operators]) , "n";

7

To expand on the tree approach, let’s say each node is either a leaf or a binary expression:

Node := Leaf | Node Operator Node

Note that a leaf is just a randomly-generated integer here.

Now, we can randomly generate a tree. Deciding the probability of each node being a leaf allows us to control the expected depth, although you might want an absolute max depth as well:

Node random_tree(leaf_prob, max_depth)
    if (max_depth == 0 || random() > leaf_prob)
        return random_leaf()

    LHS = random_tree(leaf_prob, max_depth-1)
    RHS = random_tree(leaf_prob, max_depth-1)
    return Node(LHS, RHS, random_operator())

Then, the simplest rule for printing the tree is to wrap () around each non-leaf expression and avoid worrying about operator precedence.


For example, if I parenthesize your last sample expression:

(8 - 1 + 3) * 6 - ((3 + 7) * 2)
((((8 - 1) + 3) * 6) - ((3 + 7) * 2))

you can read off the tree that would generate it:

                    SUB
                  /      
               MUL        MUL
             /     6     /   2
          ADD          ADD
         /   3        3   7
       SUB
      8   1

I would use trees. They can give you great control on the generation of the expressions. E.g. you can limit the depth per branch and width of each level separately. Tree based generation also gives the answer already during the generation, which is useful if you want to ensure that also the result (and subresults) are hard enough and/or not too hard to solve. Especially if you add division operator at some point, you can generate expressions that evaluate to whole numbers.

1

Here’s a slightly different take on Blubb’s excellent answer:

What you’re trying to build here is essentially a parser that operates in reverse. What your problem and a parser have in common is a context-free grammar, this one in Backus-Naur form:

digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
number ::= <digit> | <digit> <number>
op ::= '+' | '-' | '*' | '/'
expr ::= <number> <op> <number> | '(' <expr> ')' | '(' <expr> <op> <expr> ')'

Parsers start with a stream of terminals (literal tokens like 5 or *) and try to assemble them into nonterminals (things composed of terminals and other nonterminals, such as number or op). Your problem starts with nonterminals and works in reverse, picking anything between the “or” (pipe) symbols at random when one is encountered and recursively repeating the process until reaching a terminal.

A couple of the other answers have suggested that this is a tree problem, which it is for a certain narrow class of cases where there are no nonterminals that reference themselves directly or indirectly through another nonterminal. Since grammars allow that, this problem is really a directed graph. (Indirect references through another nonterminals count toward this as well.)

There was a program called Spew published on Usenet in the late 1980s which was originally designed to generate random tabloid headlines and also happens to be a great vehicle for experimenting with these “reverse grammars.” It operates by reading a template that directs the production of a random stream of terminals. Beyond its amusement value (headlines, country songs, pronounceable English gibberish), I’ve written numerous templates that are useful for generating test data that’s ranged from plain text to XML to syntactically-correct-but-uncompilable C. Despite being 26 years old and written in K&R C and having an ugly template format, it compiles just fine and works as advertised. I whipped up a template that solves your problem and posted it on pastebin since adding that much text here doesn’t seem appropriate.

Here is a simple recursive method in javascript:

var genExpression = function(h){
    var n = max(1,h + random(-2,0));
    var s = "" + floor(random(1,11));
    var a = ["+","-","*","/","^"];
    var u = ["√","sin"];
    for(var i = 0; i < n; i ++){
        s += (" " + a[floor(random(0,a.length))]);
        if(random() < 0.25 && h > 4){
            s += (" " + u[floor(random(0,u.length))]);
            s += ("(" + genExpression(h - 4) + ")");
            s += (" " + a[floor(random(0,a.length))]);
        }
        if(random() < 0.5 && h > 3){
            s += (" (" + genExpression(h - 2) + ")");
        }
        else{
            s += (" " + floor(random(1,11)));
        }
    }
    return s;
};

So genExpression(4) might return something like 2 ^ (8 / 10 - 1 - 1) / 2 + 5 + (3 / 9 / 7) + √(2 ^ 3) ^ 5.

1

My manydl.c program on github is generating randomly some C code as generated plugins. The generated C code contains random expressions.

In practice, the generated C code is compiled by GCC as a plugin, then dlopen(3)-ed, then dlsym(3)-ed.

Generating C or C++ code was and is used in GCC MELT and in RefPerSys.

A book explaining why is it useful is: Jacques Pitrat’s Artificial Beings, the conscience of a conscious machine

I have been able to generate half a million of more or less random C files, compile each generated file with gcc -O -fPIC -shared -Wall into a plugin, and dlopen each plugin.

Experimentally the program is never looping, but I cannot prove that.

NB: feel free to email me.

Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa Dịch vụ tổ chức sự kiện 5 sao Thông tin về chúng tôi Dịch vụ sinh nhật bé trai Dịch vụ sinh nhật bé gái Sự kiện trọn gói Các tiết mục giải trí Dịch vụ bổ trợ Tiệc cưới sang trọng Dịch vụ khai trương Tư vấn tổ chức sự kiện Hình ảnh sự kiện Cập nhật tin tức Liên hệ ngay Thuê chú hề chuyên nghiệp Tiệc tất niên cho công ty Trang trí tiệc cuối năm Tiệc tất niên độc đáo Sinh nhật bé Hải Đăng Sinh nhật đáng yêu bé Khánh Vân Sinh nhật sang trọng Bích Ngân Tiệc sinh nhật bé Thanh Trang Dịch vụ ông già Noel Xiếc thú vui nhộn Biểu diễn xiếc quay đĩa Dịch vụ tổ chức tiệc uy tín Khám phá dịch vụ của chúng tôi Tiệc sinh nhật cho bé trai Trang trí tiệc cho bé gái Gói sự kiện chuyên nghiệp Chương trình giải trí hấp dẫn Dịch vụ hỗ trợ sự kiện Trang trí tiệc cưới đẹp Khởi đầu thành công với khai trương Chuyên gia tư vấn sự kiện Xem ảnh các sự kiện đẹp Tin mới về sự kiện Kết nối với đội ngũ chuyên gia Chú hề vui nhộn cho tiệc sinh nhật Ý tưởng tiệc cuối năm Tất niên độc đáo Trang trí tiệc hiện đại Tổ chức sinh nhật cho Hải Đăng Sinh nhật độc quyền Khánh Vân Phong cách tiệc Bích Ngân Trang trí tiệc bé Thanh Trang Thuê dịch vụ ông già Noel chuyên nghiệp Xem xiếc khỉ đặc sắc Xiếc quay đĩa thú vị
Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa
Thiết kế website Thiết kế website Thiết kế website Cách kháng tài khoản quảng cáo Mua bán Fanpage Facebook Dịch vụ SEO Tổ chức sinh nhật