# Differences

This shows you the differences between two versions of the page.

— |
parcon18:project4 [2018/03/13 21:10] (current) romain created |
||
---|---|---|---|

Line 1: | Line 1: | ||

+ | ====== Barnes-Hut Simulation ====== | ||

+ | In this assignment, you will implement the parallel Barnes-Hut algorithm for //N-body simulation//. | ||

+ | N-body simulation is a simulation of a system of //N// particles that interact with physical | ||

+ | forces, such as gravity or electrostatic force. | ||

+ | Given initial positions and velocities of all the //particles// (or //bodies//), the N-body simulation | ||

+ | computes the new positions and velocities of the particles as the time progresses. | ||

+ | It does so by dividing time into discrete short intervals, and computing the | ||

+ | positions of the particles after each interval. | ||

+ | |||

+ | ===== Handout ===== | ||

+ | |||

+ | {{:parcon17:barneshut.zip|Click here to obtain the handout}} for this assignment. Do not forget to create a branch for this assignment. Instructions on how to do so are [[parcon18:git-setup|available here]]. | ||

+ | |||

+ | ---- | ||

+ | |||

+ | Let us recall some basics of classical mechanics. | ||

+ | What makes a particle, such as an atom, a comet or a star, move through space? | ||

+ | First, the particle moves through space as a consequence of having some initial velocity | ||

+ | compared to other particles. | ||

+ | Second, the particle changes its velocity as a result of external forces. | ||

+ | These forces are in direct correlation with the proximity of other particles. | ||

+ | For example, Earth moves through space with some initial velocity, | ||

+ | which is constantly being changed due to the gravitational force between the Earth and the Sun. | ||

+ | |||

+ | Before we study the Barnes-Hut algorithm for the N-body simulation problem, | ||

+ | we will focus on a simpler algorithm -- the //direct sum N-body algorithm//. | ||

+ | The direct sum algorithm consists of multiple iterations, each of which performs | ||

+ | the following steps for each particle: | ||

+ | |||

+ | == Step 1 == | ||

+ | |||

+ | The particle position is updated according to its current velocity (`delta` is a short time period). | ||

+ | |||

+ | x' = x + v_x * delta | ||

+ | y' = y + v_y * delta | ||

+ | |||

+ | == Step 2 == | ||

+ | |||

+ | The net force on the particle is computed by adding the individual forces from all the other particles. | ||

+ | |||

+ | F_x = F_1x + F_2x + F_3x + ... + F_Nx | ||

+ | F_y = F_1y + F_2y + F_3y + ... + F_Ny | ||

+ | |||

+ | == Step 3 == | ||

+ | |||

+ | The particle velocity is updated according to the net force on that particle. | ||

+ | |||

+ | v_x' = v_x + F_x / mass * delta | ||

+ | v_y' = v_y + F_y / mass * delta | ||

+ | |||

+ | |||

+ | ---- | ||

+ | |||

+ | |||

+ | In this exercise, we will assume that the force between particles is the //gravitational force// | ||

+ | from classical mechanics. Let's recall the formula for the gravitational force between two stellar bodies: | ||

+ | |||

+ | F = G * (m1 * m2) / distance^2 | ||

+ | |||

+ | Above, `F` is the absolute value of the gravitational force, `m1` and `m2` are the masses of the two bodies, | ||

+ | and `r` is the distance between them. `G` is the gravitational constant. | ||

+ | |||

+ | The gravitational force vector always points towards the other body. | ||

+ | The `F_x` and `F_y` components of the gravitational force on the body `m1` | ||

+ | can be computed by observing the following triangle similarity: | ||

+ | |||

+ | distance = math.sqrt((x2 - x1) ^ 2 + (y2 - y1) ^ 2) | ||

+ | F_x / F = (x2 - x1) / distance | ||

+ | F_y / F = (y2 - y1) / distance | ||

+ | |||

+ | This is shown in the following figure: | ||

+ | |||

+ | {{ :parcon17:force-components.png?nolink |}} | ||

+ | |||

+ | For each particle, the net force is computed by summing the components of individual forces from all other particles, | ||

+ | as shown in the following figure: | ||

+ | |||

+ | {{ :parcon17:net-force.png?nolink |}} | ||

+ | |||

+ | The direct sum N-body algorithm is very simple, but also inefficient. | ||

+ | Since we need to update `N` particles, and compute `N - 1` force contributions for each of those particles, | ||

+ | the overall complexity of an iteration step of this algorithm is `O(N^2)`. | ||

+ | As the number of particles grows larger, the direct sum N-body algorithm becomes prohibitively expensive. | ||

+ | |||

+ | The Barnes-Hut algorithm is an optimization of the direct sum N-body algorithm, | ||

+ | and is based on the following observation: | ||

+ | |||

+ | ---- | ||

+ | |||

+ | //If a cluster of bodies is sufficiently distant from a body //A//, the net force on //A// | ||

+ | from that cluster can be approximated with one big body with the mass of all the | ||

+ | bodies in the cluster, positioned at the center of mass of the cluster.// | ||

+ | |||

+ | ---- | ||

+ | |||

+ | This is illustrated in the following figure: | ||

+ | |||

+ | {{ :parcon17:observation.png?nolink |}} | ||

+ | |||

+ | To take advantage of this observation, the Barnes-Hut algorithm relies on | ||

+ | a //quadtree// -- a data structure that divides the space into cells, and answers queries | ||

+ | such as 'What is the total mass and the center of mass of all the particles in this cell?'. | ||

+ | The following figure shows an example of a quadtree for 6 bodies: | ||

+ | |||

+ | {{ :parcon17:quadtree.png?nolink |}} | ||

+ | |||

+ | Above, the total force from the bodies //B//, //C//, //D// and //E// on the body //A// can be approximated | ||

+ | by a single body with //mass// equal to the sum of masses //B//, //C//, //D// and //E//, | ||

+ | positioned at the center of mass of bodies //B//, //C//, //D// and //E//. | ||

+ | The center of mass `(massX, massY)` is computed as follows: | ||

+ | |||

+ | mass = m_B + m_C + m_D + m_E | ||

+ | massX = (m_B * x_B + m_C * x_C + m_D * x_D + m_E * x_E) / mass | ||

+ | massY = (m_B * y_B + m_C * y_C + m_D * y_D + m_E * y_E) / mass | ||

+ | |||

+ | An iteration of the Barnes-Hut algorithm is composed of the following steps: | ||

+ | |||

+ | - Construct the quadtree for the current arrangement of the bodies. | ||

+ | - Determine the //boundaries//, i.e. the square into which all bodies fit. | ||

+ | - Construct a quadtree that covers the boundaries and contains all the bodies. | ||

+ | - Update the bodies -- for each body: | ||

+ | - Update the body position according to its current velocity. | ||

+ | - Using the quadtree, compute the net force on the body by adding the individual forces from all the other bodies. | ||

+ | - Update the velocity according to the net force on that body. | ||

+ | |||

+ | It turns out that, for most spatial distribution of bodies, | ||

+ | the expected number of cells that contribute to the net force on a body is `log n`, | ||

+ | so the overall complexity of the Barnes-Hut algorithm is `O(n log n)`. | ||

+ | |||

+ | ---- | ||

+ | |||

+ | Now that we covered all the necessary theory, let's finally dig into the implementation! | ||

+ | You will implement: | ||

+ | |||

+ | - a quadtree and its combiner data structure | ||

+ | - an operation that computes the total force on a body using the quadtree | ||

+ | - a simulation step of the Barnes-Hut algorithm | ||

+ | |||

+ | Since this assignment consists of multiple components, | ||

+ | we will follow the principles of //test-driven development// and test each component separately, | ||

+ | before moving on to the next component. | ||

+ | That way, if anything goes wrong, we will more precisely know where the error is. | ||

+ | It is always better to detect errors sooner, rather than later. | ||

+ | |||

+ | |||

+ | ==== Data Structures ==== | ||

+ | |||

+ | We will start by implementing the necessary data structures: the quadtree, | ||

+ | the body data-type and the sector matrix. | ||

+ | You will find the stubs in the `package.scala` file of the `barneshut` package. | ||

+ | |||

+ | |||

+ | === Quadtree Data Structure === | ||

+ | |||

+ | In this part of the assignment, we implement the quadtree data structure, | ||

+ | denoted with the abstract data-type `Quad`. | ||

+ | Every `Quad` represents a square cell of space, and can be one of the following node types: | ||

+ | |||

+ | - an `Empty` node, which represents an empty quadtree | ||

+ | - a `Leaf` node, which represents one or more bodies | ||

+ | - a `Fork` node, which divides a spatial cell into four quadrants | ||

+ | |||

+ | The definition of `Quad` is as follows: | ||

+ | |||

+ | sealed abstract class Quad { | ||

+ | def massX: Float | ||

+ | def massY: Float | ||

+ | def mass: Float | ||

+ | def centerX: Float | ||

+ | def centerY: Float | ||

+ | def size: Float | ||

+ | def total: Int | ||

+ | def insert(b: Body): Quad | ||

+ | } | ||

+ | |||

+ | Here, `massX` and `massY` represent the center of mass of the bodies in the respective cell, | ||

+ | `mass` is the total mass of bodies in that cell, `centerX` and `centerY` are the coordinates | ||

+ | of the center of the cell, `size` is the length of the side of the cell, | ||

+ | and `total` is the total number of bodies in the cell. | ||

+ | |||

+ | Note that we consider the top left corner to be at coordinate (0, 0). We also consider the x axis to grow to the right and the y axis to the bottom. | ||

+ | |||

+ | {{ :parcon17:cell.png?nolink |}} | ||

+ | |||

+ | The method `insert` creates a new quadtree which additionally contains the body `b`, | ||

+ | and covers the same area in space as the original quadtree. | ||

+ | Quadtree is an //immutable// data structure -- `insert` does not modify the existing `Quad` object. | ||

+ | Note that `Body` has the following signature: | ||

+ | |||

+ | class Body(val mass: Float, val x: Float, val y: Float, val xspeed: Float, val yspeed: Float) | ||

+ | |||

+ | In this part of the exercise, you only need to know about body's position `x` and `y`. | ||

+ | |||

+ | Let's start by implementing the simplest `Quad` type -- the empty quadtree: | ||

+ | |||

+ | case class Empty(centerX: Float, centerY: Float, size: Float) extends Quad | ||

+ | |||

+ | The center and the size of the `Empty` quadtree are specified in its constructor. | ||

+ | The `Empty` tree does not contain any bodies, so we specify that its center of mass is equal to its center. | ||

+ | |||

+ | Next, let's implement the `Fork` quadtree: | ||

+ | |||

+ | case class Fork(nw: Quad, ne: Quad, sw: Quad, se: Quad) extends Quad | ||

+ | |||

+ | This node is specified by four child quadtrees `nw`, `ne`, `sw` and `se`, | ||

+ | in the northwest, northeast, southwest and southeast quadrant, respectively. | ||

+ | The constructor assumes that the children nodes that represent four adjacent cells of the | ||

+ | same size and adjacent to each other, as in the earlier figure. | ||

+ | The center of the `Fork` quadtree is then specified by, say, the | ||

+ | lower right corner of the quadtree `nw`. If the `Fork` quadtree is empty, the | ||

+ | center of mass coincides with the center. | ||

+ | |||

+ | Inserting into a `Fork` is recursive -- it updates the respective child and creates a new `Fork`. | ||

+ | |||

+ | Finally, the `Leaf` quadtree represents one or more bodies: | ||

+ | |||

+ | case class Leaf(centerX: Float, centerY: Float, size: Float, bodies: Seq[Body]) | ||

+ | extends Quad | ||

+ | |||

+ | If the `size` of a `Leaf` is greater than a predefined constant `minimumSize`, | ||

+ | inserting an additonal body into that `Leaf` quadtree creates a `Fork` quadtree | ||

+ | with empty children, and adds all the bodies into that `Fork` (including the new body). | ||

+ | Otherwise, inserting creates another `Leaf` with all the existing bodies and the new one. | ||

+ | |||

+ | |||

+ | === The Body Data-Type === | ||

+ | |||

+ | Next, we can implement the `Body` data-type: | ||

+ | |||

+ | class Body(val mass: Float, val x: Float, val y: Float, val xspeed: Float, val yspeed: Float) { | ||

+ | def updated(quad: Quad): Body = ??? | ||

+ | } | ||

+ | |||

+ | Here, `xspeed` and `yspeed` represent the velocity of the body, `mass` is its mass, | ||

+ | and `x` and `y` are the coordinates of the body. | ||

+ | |||

+ | The most interesting method on the `Body` is `updated` -- it takes a quadtree and | ||

+ | returns the updated version of the `Body`: | ||

+ | |||

+ | def updated(quad: Quad): Body | ||

+ | |||

+ | This method is already half-completed for you -- you only need to implement | ||

+ | its nested method `traverse`, which goes through the quadtree and proceeds casewise: | ||

+ | |||

+ | - empty quadtree does not affect the net force | ||

+ | - each body in a leaf quadtree adds some net force | ||

+ | - a fork quadtree that is sufficiently far away acts as a single point of mass | ||

+ | - a fork quadtree that is not sufficiently far away must be recursively traversed | ||

+ | |||

+ | When are we allowed to approximate a cluster of bodies with a single point? | ||

+ | The heuristic that is used is that the size of the cell divided by the distance | ||

+ | `dist` between the center of mass and the particle is less than some constant `theta`: | ||

+ | |||

+ | quad.size / dist < theta | ||

+ | |||

+ | //__Hint:__ make sure you use the `distance` to compute distance between points, | ||

+ | the `theta` value for the condition, and `addForce` to add force contributions!// | ||

+ | |||

+ | Before proceeding, make sure to run tests against your `Quad` and `Body` implementations. | ||

+ | |||

+ | |||

+ | === The Sector Matrix === | ||

+ | |||

+ | The last data structure that we will implement is the //sector matrix//. | ||

+ | In this data structure, we will use the auxiliary class `Boundaries`, which | ||

+ | contains the `minX`, `maxX`, `minY` and `maxY` fields for the boundaries of the scene: | ||

+ | |||

+ | class Boundaries { | ||

+ | var minX: Float | ||

+ | var minY: Float | ||

+ | var maxX: Float | ||

+ | var maxY: Float | ||

+ | def size = math.max(maxX - minX, maxY - minY) | ||

+ | } | ||

+ | |||

+ | We will also rely on the `ConcBuffer` data structure, mentioned in the lecture: | ||

+ | |||

+ | class ConcBuffer[T] | ||

+ | |||

+ | The `ConcBuffer` class comes with efficient `+=`, `combine` and `foreach` operations, | ||

+ | which add elements into the buffer, combine two buffers and traverse the buffer, respectively. | ||

+ | The sector matrix additionally has the `toQuad` method, which returns a quadtree that contains all | ||

+ | the elements previously added with the `+=` method. | ||

+ | Recall from the lectures that this combination of methods make the `ConcBuffer` a //combiner//. | ||

+ | |||

+ | The `SectorMatrix` is just a square matrix that covers a square region of space | ||

+ | specified by the boundaries: | ||

+ | |||

+ | class SectorMatrix(val boundaries: Boundaries, val sectorPrecision: Int) { | ||

+ | val sectorSize = boundaries.size / sectorPrecision | ||

+ | val matrix = new Array[ConcBuffer[Body]](sectorPrecision * sectorPrecision) | ||

+ | for (i <- 0 until matrix.length) matrix(i) = new ConcBuffer | ||

+ | def apply(x: Int, y: Int) = matrix(y * sectorPrecision + x) | ||

+ | } | ||

+ | |||

+ | The `sectorPrecision` argument denotes the width and height of the matrix, and | ||

+ | each entry contains a `ConcBuffer[Body]` object. Effectively, the `SectorMatrix` is a //combiner// -- | ||

+ | it partitions the square region of space into `sectorPrecision` times `sectorPrecision` buckets, called //sectors//. | ||

+ | |||

+ | {{ :parcon17:sectormatrix.png?nolink |}} | ||

+ | |||

+ | Combiners such as the `SectorMatrix` are used in parallel programming to partition the results into | ||

+ | some intermediate form that is more amenable to parallelization. | ||

+ | Recall from the lecture that one of the ways to implement a //combiner// is by using | ||

+ | a bucket data structure -- we will do exactly that in this part of the exercise! | ||

+ | We will add three methods on the `SectorMatrix` that will make it a combiner. | ||

+ | We start with the `+=` method: | ||

+ | |||

+ | def +=(b: Body): SectorMatrix = { | ||

+ | ??? | ||

+ | this | ||

+ | } | ||

+ | |||

+ | This method should use the body position, `boundaries` and `sectorPrecision` | ||

+ | to determine the sector into which the body should go into, | ||

+ | and add the body into the corresponding `ConcBuffer` object. | ||

+ | |||

+ | Next, we implement the `combine` method, which takes another `SectorMatrix`, | ||

+ | and creates a `SectorMatrix` which contains the elements of both input | ||

+ | `SectorMatrix` data structures: | ||

+ | |||

+ | def combine(that: SectorMatrix): SectorMatrix | ||

+ | |||

+ | This method calls `combine` on the pair of `ConcBuffer`s in `this` and `that` | ||

+ | matrices to produce the `ConcBuffer` for the resulting matrix. | ||

+ | |||

+ | {{ :parcon17:combine.png?nolink |}} | ||

+ | |||

+ | The nice thing about the sector matrix is that a quadtree can be constructed | ||

+ | in parallel for each sector. Those little quadtrees can then be linked together. | ||

+ | The `toQuad` method on the `SectorMatrix` does this: | ||

+ | |||

+ | def toQuad(parallelism: Int): Quad | ||

+ | |||

+ | This method is already implemented -- you can examine | ||

+ | it if you would like to know how it works. | ||

+ | |||

+ | Congratulations, you just implemented your first combiner (or second, if you've attended the live exercise sessions)! | ||

+ | Before proceeding, make sure to run those unit tests. | ||

+ | |||

+ | |||

+ | ==== Implementing Barnes-Hut ==== | ||

+ | |||

+ | Now that we have all the right data structures ready and polished, | ||

+ | implementing Barnes-Hut becomes a piece of cake. | ||

+ | |||

+ | Take a look at the file `Simulator.scala`, which contains the implementation | ||

+ | of the Barnes-Hut simulator, and in particular the `step` method. | ||

+ | The `step` method represents one step in the simulation: | ||

+ | |||

+ | def step(bodies: Seq[Body]): (Seq[Body], Quad) = { | ||

+ | // 1. compute boundaries | ||

+ | val boundaries = computeBoundaries(bodies) | ||

+ | | ||

+ | // 2. compute sector matrix | ||

+ | val sectorMatrix = computeSectorMatrix(bodies, boundaries) | ||

+ | | ||

+ | // 3. compute quadtree | ||

+ | val quad = computeQuad(sectorMatrix) | ||

+ | | ||

+ | // 4. eliminate outliers | ||

+ | val filteredBodies = eliminateOutliers(bodies, sectorMatrix, quad) | ||

+ | | ||

+ | // 5. update body velocities and positions | ||

+ | val newBodies = updateBodies(filteredBodies, quad) | ||

+ | | ||

+ | (newBodies, quad) | ||

+ | } | ||

+ | |||

+ | The pre-existing code in `step` nicely summarizes what this method does. | ||

+ | |||

+ | |||

+ | === Computing the Scene Boundaries === | ||

+ | |||

+ | First, we must compute the boundaries of all the bodies in the scene. | ||

+ | Since bodies move and the boundaries dynamically change, | ||

+ | we must do this in every iteration of the algorithm. | ||

+ | The `computeBoundaries` method is already implemented -- it uses the `aggregate` | ||

+ | combinator on the sequence of bodies to compute the boundaries: | ||

+ | |||

+ | def computeBoundaries(bodies: Seq[Body]): Boundaries = { | ||

+ | val parBodies = bodies.par | ||

+ | parBodies.tasksupport = taskSupport | ||

+ | parBodies.aggregate(new Boundaries)(updateBoundaries, mergeBoundaries) | ||

+ | } | ||

+ | |||

+ | How does this work? The `aggregate` method divides the input sequence into a | ||

+ | number of chunks. For each of the chunks, it uses the `new Boundaries` expression | ||

+ | to create the accumulation value, and then folds the values in that chunk | ||

+ | calling `updateBoundaries` on each body, in the same way a `foldLeft` operation would. | ||

+ | Finally, `aggregate` combines the results of different chunks using a reduction tree and `mergeBoundaries`. | ||

+ | |||

+ | So, we need the `updateBoundaries` method: | ||

+ | |||

+ | def updateBoundaries(boundaries: Boundaries, body: Body): Boundaries | ||

+ | |||

+ | Given an existing `boundaries` object and a body, the `updateBoundaries` updates | ||

+ | the `minX`, `minY`, `maxX` and `maxY` values so that the boundaries include the body. | ||

+ | |||

+ | Next, the `mergeBoundaries` method creates a new `Boundaries` object, which represents | ||

+ | the smallest rectangle that contains both the input boundaries: | ||

+ | |||

+ | def mergeBoundaries(a: Boundaries, b: Boundaries): Boundaries | ||

+ | |||

+ | Question: Is `mergeBoundaries` associative? Is it commutative? Does it need to be commutative? | ||

+ | |||

+ | Implement these two methods, and test that they work correctly! | ||

+ | |||

+ | |||

+ | === Building the Quadtree === | ||

+ | |||

+ | Next, we need to build a `Quad` tree from the sequence of bodies. | ||

+ | We will first implement the `computeSectorMatrix` method to get the `SectorMatrix`: | ||

+ | |||

+ | def computeSectorMatrix(bodies: Seq[Body], boundaries: Boundaries): SectorMatrix | ||

+ | |||

+ | //__Hint:__ aggregate the `SectorMatrix` from the sequence of bodies, the same way it was used for boundaries. | ||

+ | Use the SECTOR_PRECISION constant when creating a new `SectorMatrix`.// | ||

+ | |||

+ | Test that these methods work correctly before proceeding! | ||

+ | |||

+ | |||

+ | === Eliminating Outliers === | ||

+ | |||

+ | During the execution of the Barnes-Hut algorithm, some of the bodies tend to | ||

+ | move far away from most of the other bodies. There are many ways to deal with such //outliers//, | ||

+ | but to keep things simple, we will eliminate bodies that move too fast and too far away. | ||

+ | |||

+ | We will not go into details of how this works, but if you'd like to know more, | ||

+ | you can try to understand how the `eliminateOutliers` method works. | ||

+ | |||

+ | |||

+ | === Updating Bodies === | ||

+ | |||

+ | The `updateBodies` method uses the quadtree to map each body from the | ||

+ | previous iteration of the algorithm to a new iteration: | ||

+ | |||

+ | def updateBodies(bodies: Seq[Body], quad: Quad): Seq[Body] | ||

+ | |||

+ | Recall that we already implemented the `updated` method which updates a single body. | ||

+ | |||

+ | |||

+ | ==== Running Barnes-Hut ==== | ||

+ | |||

+ | At last, the parallel Barnes-Hut algorithm is implemented. | ||

+ | Note that, despite all the work, we kept our Barnes-Hut algorithm implementation simple | ||

+ | and avoided the details that a more realistic implementation must address. In particular: | ||

+ | |||

+ | - we represented each body as a single point in space | ||

+ | - we restricted the simulation to two-dimensional space | ||

+ | - we ignored close encounter effects, such as body collision or tearing | ||

+ | - we ignored any relativistic effects, and assume classical mechanics | ||

+ | - we ignored errors induced by floating point computations | ||

+ | |||

+ | You can now run it as follows: | ||

+ | |||

+ | > runMain barneshut.BarnesHut | ||

+ | |||

+ | To visualize the quadtree, press the //Show quad// button, and then hit the //Start/Pause// button. | ||

+ | |||

+ | Play with the parallelism level and the number of bodies, and observe the average speedups in the lower right corner. | ||

+ | Then sit back, and enjoy the show! |