# Fuzz Testing

## Overview

From Wikipedia: "Fuzz testing or fuzzing is a software testing technique, often
automated or semi-automated, that involves providing invalid, unexpected, or
random data to the inputs of a computer program. The program is then monitored
for exceptions such as crashes, or failing built-in code assertions or for
finding potential memory leaks. Fuzzing is commonly used to test for security
problems in software or computer systems."

Current status: our test suite contains various unit tests for MathJax's
public API, LaTeX-to-MathML conversion, the configuration options, the
JavaScript MathML rendering engine, etc. The idea is to have a minimal test
case that verifies one specific feature (e.g. one configuration option or one
LaTeX command). Regression tests are also created from reduced test cases for
issues entered in our tracker. This lets us automate the verification of a fix
on all platforms and ensures that the issue does not reappear later. Unit tests
make a test failure easy to understand, let us choose the appropriate format
and tools for each feature (e.g. JavaScript tests for the MathJax API, reftests
for MathML rendering), and avoid failures unrelated to the feature under test.

Rationale for fuzz testing: while the current approach is good for test
debugging and maintenance, it also has shortcomings. Only simple pages are
tested, and we rely exclusively on user feedback to discover more involved bugs
that need complex markup, a sophisticated configuration, etc. Even bugs that
need only slightly more complex markup may go undetected by our framework. A
concrete example is
[issue 294](https://github.com/mathjax/MathJax/issues/294): a unit test
reproducing this bug needed to make MathJax compute the space between an
`<mmultiscripts>` element and another element, but our test suite tested at
best a single `<mmultiscripts>` on a page. Of course, it is not possible to
test every possible input, but large random pages can still be very helpful in
exposing this kind of problem.

## Basic ideas

* Randomly generate large test pages to check for:
  * browser/plugin crashes
  * MathJax crashes (JavaScript errors)
  * "[Math Processing Error]" messages
  * hangs

* A test page will contain:
  * `<script>` tags to load MathJax and the test suite header.
  * Some configuration options.
  * Several LaTeX/MathML/AsciiMath fragments in various locations.
  * JavaScript code to add/remove/move/modify nodes and attributes.
  * Some MathJax API calls, especially those that reprocess/rerender the
    page or parts of it, change the output mode, etc.
  * Possibly other Web languages not parsed by MathJax, such as
    HTML/SVG/CSS.
  * Possibly some UI actions, once this is implemented via Selenium 2.

* Use our current testing infrastructure:
  * Create a reftest manifest for the generated pages and mark them as "load"
    tests.
  * Run the tests in as many browsers as possible.
  * The configuration may be randomly set in the page itself.

* Two interesting cases to consider:
  * Pages following some kind of grammar rules (valid tests): check that
    MathJax works correctly in standard situations.
  * Pages slightly violating the rules (almost valid tests): check that
    MathJax handles edge cases gracefully.

* How pages are constructed:
  * Use small fragments as starting points.
  * Recursively create bigger pieces of code by grouping smaller fragments
    together, possibly following some grammar rules.
  * Use JavaScript to add mutation rules for the DOM, either before MathJax
    starts (use `delayStartupUntil`) or after it has started
    (use e.g. `MathJax.Hub.Typeset`).
  * Add random configuration options. Some of them may be mandatory to make
    the page valid (e.g. extensions for a LaTeX command used in the page).
  * Add random MathJax API calls, simulated UI interactions, etc.

* How starting points are obtained:
  * Use grammar tokens (MathML tokens, LaTeX variables, etc.).
  * Use DOM/AsciiMath/JavaScript fragments from known unit tests
    (for example our own test suite or Mozilla's reftests/crashtests).

* Additional processing:
  * Record the fuzz actions so that a bug can be reproduced. I plan to encode
    the UI actions in the page itself, so saving the page should be enough.
  * Add "ignore" rules so that known bugs are not found again, and remove
    them once the bug is fixed.
  * Reduce fuzz test cases via a divide-and-conquer algorithm and save them
    in our crashtests/ unit tests.
  * Maintain and improve the list of starting fragments and generation rules.
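
The page construction described in this list could be sketched as follows. This is only an illustration under stated assumptions: the token list, the table of container elements, the function names and the MathJax script URL are all placeholders, not part of the existing test suite, and the test suite header script is left out. The generator is seeded, so recording the seed in the page is enough to regenerate the same page later.

```python
import random

# Hypothetical starting fragments: a few MathML token elements.
TOKENS = ["<mi>x</mi>", "<mn>2</mn>", "<mo>+</mo>", "<mtext>a</mtext>"]

# Hypothetical grammar rules: container element -> number of children it
# requires (None means "any number of children").
CONTAINERS = {"mrow": None, "msqrt": None, "mfrac": 2, "msup": 2, "munderover": 3}


def random_mathml(rng, depth):
    """Recursively group smaller fragments into one bigger MathML fragment."""
    if depth == 0 or rng.random() < 0.2:
        return rng.choice(TOKENS)
    name = rng.choice(sorted(CONTAINERS))
    arity = CONTAINERS[name]
    count = arity if arity is not None else rng.randint(1, 4)
    children = "".join(random_mathml(rng, depth - 1) for _ in range(count))
    return f"<{name}>{children}</{name}>"


def random_page(seed, formulas=20, depth=4,
                mathjax_url="MathJax.js?config=MML_HTMLorMML"):
    """Build a whole test page; mathjax_url is a placeholder path."""
    rng = random.Random(seed)  # same seed -> same page
    body = "\n".join(
        '<p><math xmlns="http://www.w3.org/1998/Math/MathML">'
        f"{random_mathml(rng, depth)}</math></p>"
        for _ in range(formulas)
    )
    return (
        "<!DOCTYPE html>\n"
        f"<!-- fuzz seed: {seed} -->\n"
        "<html><head><title>fuzz</title>\n"
        f'<script src="{mathjax_url}"></script>\n'
        "</head><body>\n"
        f"{body}\n"
        "</body></html>\n"
    )
```

Because every container gets the number of children it requires, this sketch only produces "valid" pages. An "almost valid" variant could, for example, occasionally ignore the required child count.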
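
The divide-and-conquer reduction mentioned under "Additional processing" could look roughly like the sketch below. It is a simplified chunk-halving loop, loosely inspired by the strategy described in the Lithium reference, not a copy of that tool. The names `reduce_testcase` and `is_interesting` are hypothetical; in practice `is_interesting` would rebuild a page from the remaining fragments, load it in a browser and report whether the bug still shows up.

```python
def reduce_testcase(parts, is_interesting):
    """Return a smaller list of parts that still triggers the bug.

    parts: list of page fragments (e.g. one formula or script per entry).
    is_interesting: callable taking a list of parts and returning True
        when the bug still reproduces with only those parts.
    """
    if not is_interesting(parts):
        raise ValueError("the original test case must reproduce the bug")

    # Start with the largest power of two not exceeding len(parts).
    chunk = 1
    while chunk * 2 <= len(parts):
        chunk *= 2

    while True:
        start = 0
        removed_any = False
        while start < len(parts):
            candidate = parts[:start] + parts[start + chunk:]
            if is_interesting(candidate):
                parts = candidate   # the chunk was irrelevant: drop it
                removed_any = True
            else:
                start += chunk      # the chunk is needed: keep it
        if chunk > 1:
            chunk //= 2             # retry with finer chunks
        elif not removed_any:
            return parts            # no single part can be removed
```

For example, with twenty fragments of which only two are needed to trigger a bug, the loop discards large irrelevant blocks first and ends with just those two fragments, in their original order. Each call to `is_interesting` means one page load, so keeping the number of calls low matters for slow pages.
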
## Issues to consider

* Fuzz testing requires creating large test cases many times, but MathJax is
  slow to render large pages. Currently, the [Torture Tests](https://github.com/fred-wang/MathJax-test/tree/master/testsuite/MathMLToDisplay/TortureTests/Size)
  from the MathML test suite are skipped. We will have to use a powerful
  machine, increase the Selenium timeout, and perhaps run the fuzzer for a
  long time or on a regular schedule.

* Because fuzz testing is often used for security purposes, it seems that
  fuzzer source code repositories are kept private to prevent people from
  finding security flaws. What will our policy be? Detecting a
  "[Math Processing Error]" is not too serious, but crashes in browsers or
  MathPlayer should probably be kept confidential.

## References

* [Fuzz testing](https://en.wikipedia.org/wiki/Fuzz_testing) (Wikipedia)
* [Fuzzing or how to help computers cope with the unexpected](http://cdn.ttgtmedia.com/searchSecurityUK/downloads/RHUL_Fuzzing_final.pdf)
* [Jesse Ruderman's posts about Fuzzing](http://www.squarefree.com/categories/fuzzing/)
* [Fuzzing At Mozilla](http://www.squarefree.com/fuzzing2010/fuzzing2010.xhtml)
* [Analysis of Lithium's algorithm](http://www.squarefree.com/lithium/algorithm.html)
* [Bugzilla's Metabugs for fuzz-testing tools](https://bugzilla.mozilla.org/show_bug.cgi?id=316898)
* [cross_fuzz](http://lcamtuf.coredump.cx/cross_fuzz/)
* Private communication with Abhishek Arya (Google) during Chrome 24's MathML
  testing.