Optimize

From Derivative

All realtime systems demand serious effort and production time in the optimization stage. Performance optimizations to TouchDesigner networks can often result in speedups in the 2 to 20 times range. This affects what your computer hardware is capable of running, and it affects the experience of your audience.

TouchDesigner has a Performance Monitor Dialog (alt+y) under Dialogs... to assist in optimization.

The frames-per-second depends on the compute load every frame, especially the selection of procedural operators, the amount of geometry, CPU and graphics speeds, and MOST OF ALL the optimizations that the user employs in building the project.

TouchDesigner makes heavy use of both the CPU and the GPU. Reducing the workload on these two processors requires two different strategies as described below.

Quick Tip: The first thing to determine is whether the GPU is the bottleneck. Try turning your render resolution down to 64x64 and see if things speed up. If they do, then you know it’s GPU related. It’s quite possible your issue isn’t a GPU bottleneck, though, unless you are running just GLSL shaders with little else going on in the system. Check the Performance Monitor to see what is taking up most of the time.

Determining Bottlenecks

CPU and GPU interaction is heavily pipelined. Pipelining is a feature of computing that is analogous to an assembly line in a factory. Pipelining allows for much higher speed, but when trying to optimize a pipeline you need to be very aware of bottlenecks.

Let's use a car factory as an example. Say the factory has 3 stages in its assembly line: one that makes the chassis, one that installs the engine in the chassis, and one that mounts the wheels on the chassis. Let's say the chassis department can make 10 chassis an hour, the engine installers can install 7 engines an hour, and the wheel mounters can mount 15 sets of wheels an hour. Since a car is useless unless all of the parts are installed, the factory can only output 7 cars per hour. If management goes out and buys a machine that allows the wheel mounters to mount 25 sets of wheels an hour, the factory will still only output 7 cars an hour. The engine installers are the bottleneck in the pipeline. The only way the factory can increase output is by first increasing the engine installers' efficiency, for example so they can install 10 or more engines an hour. Say they are able to increase it to 13 engines an hour. Now the factory can output 10 cars an hour, because the chassis department has become the bottleneck, and any further improvements to the engine department are useless until the chassis department is made more efficient.

All of the above is true for TouchDesigner, where the different departments are the CPU and the GPU (they can be further split into sub-departments, but we'll deal with that elsewhere). So if your system has a CPU bottleneck, reducing GPU workload will result in no change to framerates.
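The factory example can be sketched in a few lines of Python. This is only an illustration of the principle that a pipeline runs at the speed of its slowest stage; the function name `bottleneck` and the dictionary layout are assumptions made for this sketch, not anything TouchDesigner provides.

```python
def bottleneck(stage_rates):
    """Return (stage_name, units_per_hour) for the slowest stage.

    A pipeline can only output as fast as its slowest stage.
    """
    slowest = min(stage_rates, key=stage_rates.get)
    return slowest, stage_rates[slowest]


# Rates from the factory example, in units per hour.
factory = {"chassis": 10, "engines": 7, "wheels": 15}
print(bottleneck(factory))  # ('engines', 7)

# Speeding up a stage that is not the bottleneck changes nothing.
factory["wheels"] = 25
print(bottleneck(factory))  # ('engines', 7)

# Speeding up the bottleneck moves it to the next-slowest stage.
factory["engines"] = 13
print(bottleneck(factory))  # ('chassis', 10)
```

Read "CPU" and "GPU" in place of the departments and the same reasoning applies: only work removed from the slowest stage raises the frame rate.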

CPU Bottleneck

You can use a Hog CHOP, which deliberately consumes CPU time, to see if you have a CPU bottleneck. If you place down a Hog CHOP and your frame rate goes down, then you have a CPU bottleneck. If your frame rate doesn't change, then you have a GPU bottleneck.

Conversely, if you reduce the rendering or compositing of images (TOPs) to a very small resolution (like 32x32) and the frame rate does not change, you have a CPU bottleneck.
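The two tests above amount to a simple before/after comparison. The sketch below encodes that reasoning in plain Python; it does not talk to TouchDesigner, the frame rates are numbers you would read off yourself, and the function names and the 5% noise tolerance are assumptions chosen for illustration.

```python
def diagnose_with_hog(fps_before, fps_with_hog, tolerance=0.05):
    """Extra CPU load was added: a drop in frame rate points at the CPU."""
    dropped = fps_with_hog < fps_before * (1 - tolerance)
    return "CPU" if dropped else "GPU"


def diagnose_with_low_res(fps_before, fps_low_res, tolerance=0.05):
    """TOP resolution was reduced: no gain in frame rate points at the CPU."""
    gained = fps_low_res > fps_before * (1 + tolerance)
    return "GPU" if gained else "CPU"


# Hypothetical measurements.
print(diagnose_with_hog(60, 45))        # CPU
print(diagnose_with_hog(30, 30))        # GPU
print(diagnose_with_low_res(30, 60))    # GPU
print(diagnose_with_low_res(30, 30.5))  # CPU
```

The tolerance only exists so that small frame-to-frame fluctuations are not mistaken for a real change.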

GPU Bottleneck

The GPU itself has a few stages to its pipeline. A few of these are Vertex Shading, Pixel Shading, and Blending. For TOPs in general, 95% of the time the GPU bottleneck will be caused by the pixel shader. This happens when it is shading too many pixels and/or the operations it needs to do per pixel are too expensive.

To determine if you have a pixel shader bottleneck, try reducing the resolution of the TOPs that are cooking every frame. If your frame rate goes up, then you have a pixel shader bottleneck.

If you reduce the resolution to half in X and Y (Common Page), the total pixel count goes down to a quarter, and you should see the GPU time for that TOP go down to a quarter as well. (Middle mouse click on the node to see the GPU time of that node.)
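The quarter-pixel arithmetic can be checked with a short sketch. It assumes GPU time is strictly proportional to pixel count, which is the rule of thumb stated above rather than a guarantee; the function names and the 8 ms starting figure are made up for illustration.

```python
def pixel_count(width, height):
    """Total number of pixels at a given resolution."""
    return width * height


def estimated_gpu_ms(gpu_ms, scale):
    """Estimate GPU time after scaling resolution by `scale` in both X and Y.

    Assumes cost is proportional to the number of pixels shaded.
    """
    return gpu_ms * scale * scale


full = pixel_count(1920, 1080)
half = pixel_count(960, 540)
print(half / full)                 # 0.25
print(estimated_gpu_ms(8.0, 0.5))  # 2.0
```

If the measured time drops by much less than the estimate, the TOP is probably not limited by the pixel shader.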

See the section below on the Render TOP for optimizing it.

See also: Phong MAT Shader Resource Usage

Optimizing CPU Usage

Performance Monitor

Use the Performance Monitor Dialog to find expensive cooks or nodes that are cooking unnecessarily. With the Performance Monitor you can see the order of cooking within one frame and the cook times of all the stages. You can also trigger a dump on frames that exceed a certain time threshold.
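To pick a sensible threshold, it helps to know the time budget of a single frame. The sketch below is plain Python and does not reflect how the Performance Monitor works internally; the function names and the sample frame times are hypothetical.

```python
def frame_budget_ms(target_fps):
    """Time available per frame, in milliseconds, at a target frame rate."""
    return 1000.0 / target_fps


def frames_over_threshold(frame_times_ms, threshold_ms):
    """Return (frame_index, time_ms) for every frame slower than the threshold."""
    return [(i, t) for i, t in enumerate(frame_times_ms) if t > threshold_ms]


# Hypothetical per-frame times in milliseconds.
times = [12.1, 15.9, 33.4, 14.0, 18.2]
budget = frame_budget_ms(60)
print(round(budget, 2))                      # 16.67
print(frames_over_threshold(times, budget))  # [(2, 33.4), (4, 18.2)]
```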

OP cooking time, OP info pop-up

Each OP info pop-up (middle mouse click on the OP) displays how many times the OP has cooked and how long its last cook took, in milliseconds.

  • cook time: how long it takes that operator to process (or 'cook').
  • number of cooks: often what is important is whether it cooks every frame or it cooks just once, or only when its input data changes.
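The two figures matter together: a cheap operator that cooks every frame can cost more overall than an expensive one that cooks rarely. A minimal sketch of that comparison follows, using made-up operator names and timings that you would replace with values read from the info pop-ups.

```python
def cost_per_second_ms(cook_time_ms, cooks_per_second):
    """Total time spent cooking an operator during one second of playback."""
    return cook_time_ms * cooks_per_second


# (name, cook time in ms, cooks per second) -- hypothetical values.
ops = [
    ("noise1", 0.40, 60),       # cooks every frame at 60 fps
    ("moviefilein1", 2.50, 1),  # expensive, but cooks rarely
    ("math1", 0.05, 60),        # cheap, but cooks every frame
]

ranked = sorted(ops, key=lambda o: cost_per_second_ms(o[1], o[2]), reverse=True)
for name, ms, rate in ranked:
    print(name, round(cost_per_second_ms(ms, rate), 2))
```

Here the operator with the largest single cook time ends up last in the ranking, because it cooks only once per second.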

Understand Why Nodes Cook

Refer to the Cook article.

Finding unnecessary cooking

  • Re-ordering: OPs that don't change should be moved high in the cook chain so they don't cook unnecessarily.
  • Combining output length of multi-input CHOPs: When 2 or more CHOPs combine (as in a Math CHOP), make sure the resulting CHOP length isn't unnecessarily long due to one CHOP being at frame 1 and another being at the current frame.
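The length problem in the second point can be illustrated with a little arithmetic. The sketch assumes the combining CHOP outputs a range that spans all of its inputs; the actual behavior depends on the CHOP and its settings, and the function name is hypothetical.

```python
def combined_length(intervals):
    """Number of samples needed to span every input interval.

    intervals: list of (start, end) sample indices, both inclusive.
    """
    start = min(s for s, _ in intervals)
    end = max(e for _, e in intervals)
    return end - start + 1


# One single-sample CHOP stuck at frame 1, another at current frame 5000.
print(combined_length([(1, 1), (5000, 5000)]))        # 5000

# Both single-sample CHOPs at the current frame.
print(combined_length([(5000, 5000), (5000, 5000)]))  # 1
```

Two one-sample inputs can therefore produce thousands of samples of output if their positions in time do not line up.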

Faster cooking

Generally speaking, DATs, SOPs, and CHOPs cook faster on subsequent cooks when they don't have to add or remove cells, points, or channels from their output, but merely tweak the output values from the previous cook. Therefore, try to keep the amount of output being generated consistent between cooks wherever possible.

Make Sure Your Geometry Is In An Optimal Format

While the GPU actually renders geometry, there can be a large cost involved with telling the GPU all of the information it needs to render this geometry. Refer to the Optimize Geometry for Rendering article and the Render TOP for more information on this.

Transform your Geometry at the Object Level instead of in a SOP

Transform at the object level if you can. A Transform SOP among an object's SOPs often has the same effect of translating, rotating, or scaling the object as applying the same transform parameters at the object level. However, their compute times are very different. In a SOP, the CPU needs to apply the transform to every point. If the transform is on the object, the graphics hardware does the transform, which is much faster and is applied to the object as a whole.
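The difference in CPU work can be sketched in plain Python. This is a toy model, not TouchDesigner code: it uses a translation only, the function names are hypothetical, and the "object level" version simply records the offset to stand in for handing the transform to the graphics hardware.

```python
def transform_in_sop(points, offset):
    """SOP-style: the CPU touches every point, so work grows with point count."""
    ox, oy, oz = offset
    return [(x + ox, y + oy, z + oz) for x, y, z in points]


def transform_at_object(points, offset):
    """Object-style: the CPU only records one transform.

    The points are passed through untouched; applying the transform
    per vertex is left to the graphics hardware.
    """
    return points, offset


points = [(float(i), 0.0, 0.0) for i in range(100_000)]

moved = transform_in_sop(points, (1.0, 2.0, 3.0))  # 100,000 point updates on the CPU
same_points, object_offset = transform_at_object(points, (1.0, 2.0, 3.0))  # one value stored

print(moved[0])               # (1.0, 2.0, 3.0)
print(same_points is points)  # True: no per-point CPU work was done
print(object_offset)          # (1.0, 2.0, 3.0)
```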

Reducing Cooking in CHOPs

The Null CHOP in Selective mode can be used to reduce downstream cooking in CHOP chains. However, the Null CHOP itself will always cook on data changes in this mode, so use with caution.

Other processes running and interrupting TouchDesigner

  • Turn off all virus checkers and spy-ware services while running TouchDesigner. These services can often use lots of CPU cycles and access the hard drive frequently.
  • Turn off all unnecessary Windows services.

Multi-CPU usage

  • You can run multiple instances of TouchDesigner on one computer. Then using the various Touch In/Outs you can move data between the processes and thus offload some work onto other processors.
  • Beware that doing this with the Touch In TOP and Touch Out TOP and large images may not give you the performance boost you are looking for, as TOPs use the graphics card heavily and you only have one of those. However, you could certainly do it on another physical computer over a fast gigabit network.

Optimizing GPU Usage

Reduce Number of Pixels in Render TOP

As with other TOPs, if you reduce the resolution of a Render TOP to half in X and Y (Common Page), the total pixel count goes down to a quarter, and you should see the GPU render time go down to a quarter. (Middle mouse click on the node to see the times.)

Reduce Overdraw in Render TOP

Refer to the article on Early Depth-Test.

Another feature that can help reduce overdraw is Back-Face Culling.

Reduce Vertices in Render TOP

If the number of vertices or particles is very high, the Vertex Shader stage will add to the GPU workload, especially if you have GLSL shaders with Vertex Shader code. Try reducing the number of vertices or particles in this case and see if the render time goes down.

Reduce GPU Workload in Render TOP

What greatly affects speed is the number of lights, the type of lights, and the features that are used in MATs. For example, a cone light is more expensive than a point or distant light (see Light COMP). Enabling features like Rim Lighting in the Phong MAT will increase the pixel shader cost as well. (To see the cost, turn off the rim lighting and see how much faster things run.)

Render TOP Parameters

There are some parameters on the Render TOP's Advanced page that can be used to speed up rendering. For example, if you are only calculating a shadow map, you can turn on the Draw Depth Only feature. If you don't need the color output from that particular Render TOP, you can turn off Color Output Needed, which avoids resolving the offscreen antialias buffer down into a texture that can be used as an input for other TOPs.

Reduce Workload of other TOPs

As noted above, the cost of other TOPs is generally tied directly to their resolution, so reducing their pixel count gives a 1:1 reduction in GPU workload. Some TOPs are also more or less expensive depending on their parameters. The filter size of the Blur TOP is an example of this.

Optimizing Memory Usage

Audio Play CHOP

The Audio Play CHOP currently loads the entire sound file into RAM. If you have a longer sound file, you should consider using the Audio File In CHOP.
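To get a feel for how much RAM a fully loaded sound file can take, here is a rough estimate in Python. It assumes the audio is held uncompressed as 32-bit samples at 44.1 kHz stereo; the in-memory format actually used by the Audio Play CHOP is not stated here, so treat the defaults and the function name as assumptions.

```python
def audio_ram_bytes(seconds, sample_rate=44100, channels=2, bytes_per_sample=4):
    """Estimate RAM needed to hold uncompressed audio entirely in memory.

    The default sample rate, channel count, and sample size are assumptions.
    """
    return int(seconds * sample_rate * channels * bytes_per_sample)


ten_minutes = audio_ram_bytes(10 * 60)
print(ten_minutes)                          # 211680000
print(round(ten_minutes / (1024 ** 2), 1))  # 201.9 (MiB)
```

Under these assumptions a ten-minute file occupies roughly 200 MiB, which is why long files are better streamed from disk.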

See Also

Stuttered Playback

GPU: The Graphics Processing Unit. This is the high-speed, many-core processor of the graphics card/chip that takes geometry, images and data from the CPU and creates images and processed data.

CHOP: An Operator Family that operates on Channels (a series of numbers), which are used for animation, audio, mathematics, simulation, logic, UI construction, and many other applications.

TOP: An Operator Family that creates, composites and modifies images, and reads/writes images and movies to/from files and the network. TOPs run on the graphics card's GPU.

Performance Monitor: The tool built into TouchDesigner that analyzes and displays what TouchDesigner is doing as it generates an image.

OP: Any of the procedural data operators. OPs do all the work in TouchDesigner. They "cook" and output data to other OPs, which ultimately results in new images, data and audio being generated. See Node.

Cook: To re-compute the output data of the Operators. An operator cooks when (1) its inputs change, (2) its Parameters change, (3) the timeline moves forward, in some cases, or (4) Scripting commands are run on the node. When the operator is a Gadget, it also cooks when a user interacts with it. When an operator cooks, it usually causes operators connected to its output to re-cook. When TouchDesigner draws the screen, it re-cooks all the necessary operators in all Networks, contributing to a frame's total "cook time".

Object Components: The component types that are used to render 3D scenes: Geometry Components contain the 3D shapes to render, plus Camera, Light, Ambient Light, Null, Bone, Handle and other component types.

SOP: An Operator Family that reads, creates and modifies 3D polygons, curves, NURBS surfaces, spheres, metaballs and other 3D surface data.

Shader: The OpenGL code that creates a rendered image from polygons and textures. Shaders can be made of up to three parts: Vertex Shader, Geometry Shader and/or Pixel Shader, which are either embedded inside Materials, or placed in Text DATs and referenced by a GLSL Material.