Metal by Tutorials
By Caroline Begbie & Marius Horga
Copyright ©2022 Razeware LLC.
Notice of Rights

All rights reserved. No part of this book or corresponding materials (such as text, images, or source code) may be reproduced or distributed by any means without prior written permission of the copyright owner.

Notice of Liability

This book and all corresponding materials (such as source code) are provided on an “as is” basis, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.

Trademarks

All trademarks and registered trademarks appearing in this book are the property of their respective owners.
Table of Contents: Overview

Book License
Before You Begin
  What You Need
  Book Source Code & Forums
  Acknowledgments
  Introduction

Section I: Beginning Metal
  Chapter 1: Hello, Metal!
  Chapter 2: 3D Models
  Chapter 3: The Rendering Pipeline
  Chapter 4: The Vertex Function
  Chapter 5: 3D Transformations
  Chapter 6: Coordinate Spaces
  Chapter 7: The Fragment Function
  Chapter 8: Textures
  Chapter 9: Navigating a 3D Scene
  Chapter 10: Lighting Fundamentals

Section II: Intermediate Metal
  Chapter 11: Maps & Materials
  Chapter 12: Render Passes
  Chapter 13: Shadows
  Chapter 14: Deferred Rendering
  Chapter 15: Tile-Based Deferred Rendering
  Chapter 16: GPU Compute Programming
  Chapter 17: Particle Systems
  Chapter 18: Particle Behavior

Section III: Advanced Metal
  Chapter 19: Tessellation & Terrains
  Chapter 20: Fragment Post-Processing
  Chapter 21: Image-Based Lighting
  Chapter 22: Reflection & Refraction
  Chapter 23: Animation
  Chapter 24: Character Animation
  Chapter 25: Managing Resources
  Chapter 26: GPU-Driven Rendering

Section IV: Ray Tracing
  Chapter 27: Rendering With Rays
  Chapter 28: Advanced Shadows
  Chapter 29: Advanced Lighting
  Chapter 30: Metal Performance Shaders
  Chapter 31: Performance Optimization
  Chapter 32: Best Practices

Conclusion
Table of Contents: Extended

Book License

Before You Begin
  What You Need
  Book Source Code & Forums
  About the Authors
  About the Editors
  Acknowledgments
  Introduction
    About This Book
    How Did Metal Come to Life?
    Why Would You Use Metal?
    When Should You Use Metal?
    Who This Book Is For?
    How to Read This Book?

Section I: Beginning Metal
  Chapter 1: Hello, Metal!
    What is Rendering?
    What is a Frame?
    Your First Metal App
    The Metal View
    Rendering
    Challenge
    Key Points
  Chapter 2: 3D Models
    What Are 3D Models?
    Creating Models With Blender
    3D File Formats
    Exporting to Blender
    The .obj File Format
    The .mtl File Format
    Material Groups
    Vertex Descriptors
    Metal Coordinate System
    Submeshes
    Challenge
    Key Points
  Chapter 3: The Rendering Pipeline
    The GPU and CPU
    The Metal Project
    The Render Pipeline
    Challenge
    Key Points
  Chapter 4: The Vertex Function
    Shader Functions
    The Starter Project
    Rendering a Quad
    Calculating Positions
    More Efficient Rendering
    Vertex Descriptors
    Adding Another Vertex Attribute
    Rendering Points
    Challenge
    Key Points
  Chapter 5: 3D Transformations
    Transformations
    The Starter Project & Setup
    Translation
    Vectors & Matrices
    Creating a Matrix
    Scaling
    Rotation
    Key Points
  Chapter 6: Coordinate Spaces
    The Starter Project
    Uniforms
    Projection
    Refactoring the Model Matrix
    Key Points
    Where to Go From Here?
  Chapter 7: The Fragment Function
    The Starter Project
    Screen Space
    Metal Standard Library Functions
    Normals
    Loading the Train Model With Normals
    Depth
    Hemispheric Lighting
    Challenge
    Key Points
    Where to Go From Here?
  Chapter 8: Textures
    Textures and UV Maps
    The Starter App
    sRGB Color Space
    Capture GPU Workload
    Samplers
    Mipmaps
    The Asset Catalog
    Texture Compression
    Key Points
  Chapter 9: Navigating a 3D Scene
    The Starter Project
    Scenes
    Cameras
    Input
    Delta Time
    Mouse and Trackpad Input
    Arcball Camera
    Orthographic Projection
    Challenge
    Key Points
  Chapter 10: Lighting Fundamentals
    The Starter Project
    Representing Color
    Normals
    Light Types
    Directional Light
    The Phong Reflection Model
    The Dot Product
    Creating Shared Functions in C++
    Point Lights
    Spotlights
    Key Points
    Where to Go From Here?

Section II: Intermediate Metal
  Chapter 11: Maps & Materials
    Normal Maps
    The Starter App
    Using Normal Maps
    Materials
    Physically Based Rendering (PBR)
    Channel Packing
    Challenge
    Where to Go From Here?
  Chapter 12: Render Passes
    Render Passes
    Object Picking
    The Starter App
    Setting up Render Passes
    Creating a UInt32 Texture
    Adding the Render Pass to Renderer
    Adding the Shader Function
    Adding the Depth Attachment
    Load & Store Actions
    Reading the Object ID Texture
    Key Points
  Chapter 13: Shadows
    Shadow Maps
    The Starter Project
    Identifying Problems
    Visualizing the Problems
    Solving the Problems
    Cascaded Shadow Mapping
    Key Points
  Chapter 14: Deferred Rendering
    The Starter Project
    The G-buffer Pass
    Updating Renderer
    The Lighting Shader Functions
    Adding Point Lights
    Blending
    Key Points
  Chapter 15: Tile-Based Deferred Rendering
    Programmable Blending
    Tiled Deferred Rendering
    The Starter Project
    Stencil Tests
    Create the Stencil Texture
    Configure the Stencil Operation
    Masking the Sky
    Challenge
    Key Points
    Where to Go From Here?
  Chapter 16: GPU Compute Programming
    The Starter Project
    Winding Order and Culling
    Reversing the Model on the CPU
    Compute Processing
    Reversing the Warrior Using GPU Compute Processing
    The Kernel Function
    Atomic Functions
    Key Points
  Chapter 17: Particle Systems
    Particle
    Emitter
    The Starter Project
    Creating a Particle and Emitter
    The Fireworks Pass
    Particle Dynamics
    Implementing Particle Physics
    Particle Systems
    Rendering a Particle System
    Configuring Particle Effects
    Fire
    Key Points
    Where to Go From Here?
  Chapter 18: Particle Behavior
    Behavioral Animation
    Swarming Behavior
    The Starter Project
    Velocity
    Behavioral Rules
    Key Points
    Where to Go From Here?

Section III: Advanced Metal
  Chapter 19: Tessellation & Terrains
    Tessellation
    The Starter Project
    The Tessellation Kernel
    Multiple Patches
    Tessellation By Distance
    Displacement
    Shading By Slope
    Challenge
    Key Points
    Where to Go From Here?
  Chapter 20: Fragment Post-Processing
    The Starter App
    Alpha Testing
    Depth Testing
    Stencil Testing
    Scissor Testing
    Alpha Blending
    Opacity
    Blending
    Antialiasing
    Fog
    Key Points
    Where to Go From Here?
  Chapter 21: Image-Based Lighting
    The Starter Project
    The Skybox
    Procedural Skies
    Reflection
    Image-Based Lighting
    Challenge
    Key Points
    Where to Go From Here?
  Chapter 22: Reflection & Refraction
    The Starter Project
    Rendering Rippling Water
    1. Creating the Water Surface
    2. Rendering the Reflection
    3. Creating Clipping Planes
    4. Rippling Normal Maps
    5. Adding Refraction
    6. The Fresnel Effect
    7. Adding Smoothness Using a Depth Texture
    Key Points
    Where to Go From Here?
  Chapter 23: Animation
    The Starter Project
    Animation
    Procedural Animation
    Animation Using Physics
    Keyframes
    Quaternions
    USD and USDZ Files
    Animating Meshes
    Challenge
    Key Points
  Chapter 24: Character Animation
    Skeletal Animation
    The Starter App
    Implementing Skeletal Animation
    Loading the Animation
    The Joint Matrix Palette
    The Inverse Bind Matrix
    Updating the Vertex Shader
    Function Specialization
    Key Points
    Where to Go From Here?
  Chapter 25: Managing Resources
    The Starter Project
    Argument Buffers
    Resource Heaps
    Key Points
  Chapter 26: GPU-Driven Rendering
    The Starter Project
    Indirect Command Buffers
    GPU-Driven Rendering
    Challenge
    Key Points
    Where to Go From Here?

Section IV: Ray Tracing
  Chapter 27: Rendering With Rays
    Getting Started
    Ray Casting
    Ray Tracing
    Path Tracing
    Raymarching
    Signed Distance Functions
    The Starter Playground
    Using a Signed Distance Function
    The Raymarching Algorithm
    Creating Random Noise
    Marching Clouds
    Key Points
  Chapter 28: Advanced Shadows
    The Starter Playground
    Hard Shadows
    Soft Shadows
    Ambient Occlusion
    Key Points
    Where to Go From Here?
  Chapter 29: Advanced Lighting
    The Rendering Equation
    Reflection
    Getting Started
    Refraction
    Raytraced Water
    Key Points
    Where to Go From Here?
  Chapter 30: Metal Performance Shaders
    Overview
    The Sobel Filter
    Image Processing
    The Starter Project
    The Blit Command Encoder
    Gaussian Blur
    Matrix / Vector Mathematics
    Challenge
    Key Points
  Chapter 31: Performance Optimization
    The Starter App
    Profiling
    GPU Workload Capture
    GPU Timeline
    Instancing
    Removing Duplicate Textures
    CPU-GPU Synchronization
    Key Points
  Chapter 32: Best Practices
    General Performance Best Practices
    Memory Bandwidth Best Practices
    Memory Footprint Best Practices
    Where to Go From Here?

Conclusion
Book License
By purchasing Metal by Tutorials, you have the following license:

• You are allowed to use and/or modify the source code in Metal by Tutorials in as many apps as you want, with no attribution required.

• You are allowed to use and/or modify all art, images and designs that are included in Metal by Tutorials in as many apps as you want, but must include this attribution line somewhere inside your app: “Artwork/images/designs: from Metal by Tutorials, available at www.raywenderlich.com”.

• The source code included in Metal by Tutorials is for your personal use only. You are NOT allowed to distribute or sell the source code in Metal by Tutorials without prior authorization.

• This book is for your personal use only. You are NOT allowed to sell this book without prior authorization, or distribute it to friends, coworkers or students; they would need to purchase their own copies.

All materials provided with this book are provided on an “as is” basis, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.

All trademarks and registered trademarks appearing in this guide are the properties of their respective owners.
Before You Begin
This section tells you a few things you need to know before you get started, such as what you’ll need for hardware and software, where to find the project files for this book and more.
What You Need
To follow along with the tutorials in this book, you need the following:

• A Metal-capable Mac running macOS Monterey 12.0 or later. All Macs built since 2012 should run Metal, although not all of them will be able to run the most recent features in Metal 2. Nvidia GPUs will have issues, in some cases serious, as drivers have not been updated since macOS High Sierra. One chapter includes Apple GPU-specific code, for which you’ll need either an M1 GPU or a recent iPhone or iPad.

• Xcode 13.3 or later.

• [optional] A Metal-capable iPhone or iPad running iOS 15 or later. Any iOS device running the A7 chip or later will run Metal. The latest features, such as tile shading and imageblocks, will only run on the A11 (or later) chipset.

The projects will build and run on macOS, and most of them will run on the iOS Simulator, so using an iOS device is optional. If you wish to make an iOS game, the game engine you build while reading this book will have an iOS target as well.
Book Source Code & Forums
Where to download the materials for this book

The materials for this book can be cloned or downloaded from the GitHub book materials repository:

• https://github.com/raywenderlich/met-materials/tree/editions/3.0
Forums

We’ve also set up an official forum for the book at https://forums.raywenderlich.com/c/books/metal-by-tutorials. This is a great place to ask questions about the book or to submit any errors you may find.
“To Warren Moore, who first made it possible for me to learn Metal, to my wonderful children Robin and Kayla, and to my best friends who patiently waited for me to indulge my dream.”

— Caroline Begbie

“To my wife, Adina, and my son, Victor Nicholas, without whose patience, support and understanding I could not have made it. To Warren Moore who first whet my appetite for Metal, offered his advice when needed and motivated me to get involved with Metal too. To Chris Wood who taught me that most of the time all you need to render is a ray, a camera and a few distance fields. To Simon Gladman whose amazing work with compute kernels inspired me to write more about particles and fluid dynamics. To Jeff Biggus who keeps the GPU programming community in Chicago alive. Our daily conversations motivate me to stay hungry for more. To everyone else who believes in me. A huge thanks to all of you!”

— Marius Horga
About the Team
About the Authors

Caroline Begbie is a co-author of this book. Caroline is an indie iOS developer. When she’s not developing, she’s playing around with 2D and 3D animation software, or planning The Big Lap around Australia. She has previously taught the elderly how to use their computers, done marionette shows for pre-schools, and created accounting and stock control systems for mining companies.

Marius Horga is a co-author of this book. Marius is an iOS developer and Metal API blogger. He is also a computer scientist. He has more than a decade of experience with systems, support, integration and development. You can often see him on Twitter talking about Metal, GPGPU, games and 3D graphics. When he’s away from computers, he enjoys music, biking or stargazing.

About the Editors

Adrian Strahan is the technical editor of this book. Adrian is a lead iOS developer working for a leading UK bank. When he’s not sat in front of a computer, he enjoys watching sport, listening to music and trying to keep fit and healthy.

Tammy Coron is the final pass editor of this book. Tammy is an independent creative professional, author of Apple Game Frameworks and Technologies, and the maker behind the AdventureGameKit — a custom SpriteKit framework for building point and click adventure games. Find out more at tammycoron.com.
Acknowledgments
Many of the models and images used in this book were made by the authors and the raywenderlich.com team. In some cases, the authors used public domain or CC BY 4.0 (commercial use allowed) models and images, and included links and licenses in either the projects’ Models folder or references.markdown for the chapter.
Introduction
Welcome to Metal by Tutorials!

Metal is a unified, low-level, low-overhead application programming interface (API) for the graphics processing unit, or GPU. It’s unified because it applies to both 3D graphics and data-parallel computation paradigms. Metal is a low-level API because it provides programmers near-direct access to the GPU. Finally, Metal is a low-overhead API because it reduces the runtime cost through multi-threading and pre-compiling of resources.

But beyond the technical definition, Metal is the most appropriate way to use the GPU’s parallel processing power to visualize data or solve numerical challenges. It’s also tailored to be used for machine learning, image/video processing or, as this book describes, graphics rendering.
About This Book

This book introduces you to low-level graphics programming in Metal — Apple’s framework for programming on the graphics processing unit (GPU). As you progress through this book, you’ll learn many of the fundamentals that go into making a game engine and gradually put together your own engine.

Once your game engine is complete, you’ll be able to put together 3D scenes and program your own simple 3D games. Because you’ll have built your 3D game engine from scratch, you’ll be able to customize every aspect of what you see on your screen.
How Did Metal Come to Life?

Historically, you had two choices to take advantage of the power of the GPU: OpenGL and the Windows-only DirectX. In 2013, the GPU vendor AMD announced the Mantle project in an effort to revamp GPU APIs and come up with an alternative to Direct3D (which is part of DirectX) and OpenGL. AMD was the first to create a true low-overhead API for low-level access to the GPU. Mantle promised to generate up to 9 times more draw calls (the number of objects drawn to the screen) than similar APIs and also introduced asynchronous command queues so that graphics and compute workloads could run in parallel. Unfortunately, the project was terminated before it could become a mainstream API.

Metal was announced at the Worldwide Developers Conference (WWDC) on June 2, 2014 and was initially made available only on A7 or newer GPUs. Apple created a new language to program the GPU directly via shader functions: the Metal Shading Language (MSL), based on the C++11 specification. A year later, at WWDC 2015, Apple announced two Metal sub-frameworks: MetalKit and Metal Performance Shaders (MPS). In 2018, MPS made a spectacular debut as a Ray Tracing accelerator.

The API continues to evolve to work with the exciting features of the new Apple GPUs designed in-house by Apple. Metal 2 adds support for Virtual Reality (VR), Augmented Reality (AR) and accelerated machine learning (ML), among many new features, including image blocks, tile shading and threadgroup sharing. MSL was also updated to version 2.0 in Fall 2017 and is now based on the C++14 specification.
Why Would You Use Metal?

Metal is a top-notch graphics API. That means Metal can empower graphics pipelines and, more specifically, game engines such as the following:

• Unity and Unreal Engine: The two leading cross-platform game engines today are ideal for game programmers who target a range of console, desktop and mobile devices. However, these engines haven’t always kept pace with new features in Metal. For example, Unity announced that tessellation on iOS was to be released in 2018, despite it being demonstrated live at WWDC 2016. If you want to use cutting-edge Metal developments, you can’t always depend on third-party engines.
• Divinity - Original Sin 2: Larian Studios worked closely with Apple to bring their amazing AAA game to iPad, taking advantage of Metal and the Apple GPU hardware. It truly is a stunning visual experience.

• The Witness: This award-winning puzzle game has a custom engine that runs on top of Metal. By taking advantage of Metal, the iPad version is every bit as stunning as the desktop version and is highly recommended for puzzle game fans.

• Many Others: Notable game titles that use Metal include Hitman, BioShock, Deus Ex, Mafia, Starcraft, World of Warcraft, Fortnite, Unreal Tournament, Batman and even the beloved Minecraft.

But Metal isn’t limited to the world of gaming. There are many apps that benefit from GPU acceleration for image and video processing:

• Procreate: An app for sketching, painting and illustrating. Since converting to Metal, it runs four times faster than it did before.

• Pixelmator: A Metal-based app that provides image distortion tools. In fact, they were able to implement a new painting engine and dynamic paint blending technology powered by Metal 2.

• Affinity Photo: Available on the iPad. According to the developer Serif, “Using Metal allows users to work easily on large, super high-resolution photographs, or complex compositions with potentially thousands of layers.”

Metal, and in particular the MPS sub-framework, is incredibly useful in the realm of machine and deep learning on convolutional neural networks. Apple presented a practical machine learning application at WWDC 2016 that demonstrated the power of CNNs in high-precision image recognition.
When Should You Use Metal?

GPUs belong to a special class of computation that Flynn’s taxonomy terms Single Instruction Multiple Data (SIMD). Simply put, GPUs are processors that are optimized for throughput (how much data can be processed in one unit of time), while CPUs are optimized for latency (how much time it takes a single unit of data to be processed).

Most programs execute serially: they receive input, process it, provide output and then the cycle repeats. Those cycles sometimes perform computationally intensive tasks, such as large matrix multiplication, which would take CPUs a lot of time to process serially, even in a multithreaded manner on a handful of cores.
In contrast, GPUs have hundreds or even thousands of cores which are smaller and have less memory than CPU cores, but perform fast parallel mathematical calculations.

Choose Metal when:

• You want to render 3D models as efficiently as possible.

• You want your game to have its own unique style, perhaps with custom lighting and shading.

• You will be performing intensive data processes, such as calculating and changing the color of each pixel on the screen every frame, as you would when processing images and video.

• You have large numerical problems, such as scientific simulations, that you can partition into independent sub-problems to be processed in parallel.

• You need to process multiple large datasets in parallel, such as when you train models for deep learning.
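To make that contrast concrete, here is a minimal sketch of a GPU compute kernel written in the Metal Shading Language. It isn't taken from the book's projects; the kernel name, buffer layout and scale parameter are illustrative assumptions. The point is the data-parallel style described above: instead of looping over an array on the CPU, the GPU runs this function once per element, in parallel.

#include <metal_stdlib>
using namespace metal;

// Hypothetical kernel: scales every element of a large array in parallel.
// There is no explicit loop: the GPU launches one thread per element,
// and `id` tells each thread which element it owns.
kernel void scale_values(
  device float *values [[buffer(0)]],    // data buffer shared with the CPU
  constant float &scale [[buffer(1)]],   // a single uniform parameter
  uint id [[thread_position_in_grid]])   // this thread's index in the grid
{
  values[id] *= scale;
}

The CPU equivalent would be a for loop over every element; on the GPU, thousands of these threads run at once, which is exactly the throughput-oriented behavior described above.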
Who This Book Is For
This book is for intermediate Swift developers interested in learning 3D graphics or gaining a deeper understanding of how game engines work. If you don't know Swift, you can still follow along, as all the code instructions are included in the book. You'll gain general graphics knowledge, but it will be less confusing if you cover Swift basics first. We recommend the Swift Apprentice book, available in our catalogue: https://www.raywenderlich.com/books/swift-apprentice
A smattering of C++ knowledge would be useful too. The Metal Shading Language that you'll use when writing GPU shader functions is based on C++. But, again, all the code you'll need is included in the book.
How to Read This Book
If you're a beginner to iOS/macOS development or Metal, you should read this book from cover to cover. If you're an advanced developer, or already have experience with Metal, you can skip from chapter to chapter or use this book as a reference.
Section I: Beginning Metal
It takes a wealth of knowledge to render a simple triangle on the screen or animate game characters. This section will guide you through the necessary basics of vertex wrangling, lighting, textures and creating a game scene. If you’re worried about the math, don’t be! Although computer graphics is highly math-intensive, each chapter explains everything you need, and you’ll get experience creating and rendering models.
Chapter 1: Hello, Metal!
You've been formally introduced to Metal and discovered its history and why you should use it. Now you're going to try it out for yourself in a Swift playground. To get started, you'll render this sphere on the screen:
The final result
It may not look exciting, but this is a great starting point because it lets you touch on almost every part of the rendering process. But before you get started, it's important to understand the terms rendering and frames.
What is Rendering?
In 3D computer graphics, you take a bunch of points, join them together and create an image on the screen. This image is known as a render. Rendering an image from points involves calculating light and shade for each pixel on the screen. Light bounces around a scene, so you have to decide how complicated your lighting is and how long each image takes to render. A single image in a Pixar movie might take days to render, but games require real-time rendering, where you see the image immediately.
There are many ways to render a 3D image, but most start with a model built in a modeling app such as Blender or Maya. Take, for example, this train model that was built in Blender:
A train model in Blender
This model, like all other models, is made up of vertices. A vertex refers to a point in three-dimensional space where two or more lines, curves or edges of a geometrical shape meet, such as the corners of a cube. The number of vertices in a model may vary from a handful, as in a cube, to thousands or even millions in more complex models.
A 3D renderer will read in these vertices using model loader code, which parses the list of vertices. The renderer then passes the vertices to the GPU, where shader functions process the vertices to create the final image or texture to be sent back to the CPU and displayed on the screen.
The following render uses the 3D train model and some different shading techniques to make it appear as if the train were made of shiny copper:
Shading techniques cause reflection
The entire process, from importing a model's vertices to generating the final image on your screen, is commonly known as the rendering pipeline. The rendering pipeline is a list of commands sent to the GPU, along with resources (vertices, materials and lights) that make up the final image. The pipeline includes programmable and non-programmable functions. The programmable parts of the pipeline, known as vertex functions and fragment functions, are where you can manually influence the final look of your rendered models. You'll learn more about each later in the book.
What is a Frame?
A game wouldn't be much fun if all it did was render a single still image. Moving a character around the screen in a fluid manner requires the GPU to render a still image roughly sixty times a second. Each still image is known as a frame, and the speed at which the images appear is known as the frame rate. When your favorite game appears to stutter, it's usually because of a decrease in the frame rate, especially if there's an excessive amount of background processing eating away at the GPU.
When designing a game, it's important to balance the result you want with what the hardware can deliver. While it might be cool to add real-time shadows, water reflections and millions of blades of animated grass — all of which you'll learn how to do in this book — finding the right balance between what is possible and what the GPU can process in 1/60th of a second can be tough.
Your First Metal App
In your first Metal app, the shape you'll render will look more like a flat circle than a 3D sphere. That's because your first model will not include any perspective or shading. However, its vertex mesh contains the full three-dimensional information. The process of Metal rendering is much the same no matter the size and complexity of your app, and you'll become very familiar with the following sequence of drawing your models on the screen:
You may initially feel a little overwhelmed by the number of steps Metal requires, but don’t worry. You’ll always perform these steps in the same sequence, and they’ll gradually become second nature. This chapter won’t go into detail on every step, but as you progress through the book, you’ll get more information as you need it. For now, concentrate on getting your first Metal app running.
Getting Started
➤ Start Xcode, and create a new playground by selecting File ▸ New ▸ Playground… from the main menu. When prompted for a template, choose macOS Blank.
The playground template
➤ Name the playground Chapter1, and click Create.
➤ Next, delete everything in the playground.
The Metal View
Now that you have a playground, you'll create a view to render into.
➤ Import the two main frameworks that you'll be using by adding this:

import PlaygroundSupport
import MetalKit

PlaygroundSupport lets you see live views in the assistant editor, and MetalKit is a framework that makes using Metal easier. MetalKit has a customized view named MTKView and many convenience methods for loading textures, working with Metal buffers and interfacing with another useful framework: Model I/O, which you'll learn about later.
➤ Now, add this:

guard let device = MTLCreateSystemDefaultDevice() else {
  fatalError("GPU is not supported")
}
This code checks for a suitable GPU by creating a device.
Note: Are you getting an error? If you accidentally created an iOS playground instead of a macOS playground, you'll get a fatal error because the iOS simulator is not supported.
➤ To set up the view, add this:

let frame = CGRect(x: 0, y: 0, width: 600, height: 600)
let view = MTKView(frame: frame, device: device)
view.clearColor = MTLClearColor(red: 1, green: 1, blue: 0.8, alpha: 1)
This code configures an MTKView for the Metal renderer. MTKView is a subclass of NSView on macOS and of UIView on iOS. MTLClearColor represents an RGBA value — in this case, cream. The color value is stored in clearColor and is used to set the color of the view.
The Model
Model I/O is a framework that integrates with Metal and SceneKit. Its main purpose is to load 3D models that were created in apps like Blender or Maya, and to set up data buffers for easier rendering.
Instead of loading a 3D model, you're going to load a Model I/O basic 3D shape, also called a primitive. A primitive is typically considered a cube, a sphere, a cylinder or a torus.
➤ Add this code to the end of the playground:

// 1
let allocator = MTKMeshBufferAllocator(device: device)
// 2
let mdlMesh = MDLMesh(
  sphereWithExtent: [0.75, 0.75, 0.75],
  segments: [100, 100],
  inwardNormals: false,
  geometryType: .triangles,
  allocator: allocator)
// 3
let mesh = try MTKMesh(mesh: mdlMesh, device: device)
Going through the code:
1. The allocator manages the memory for the mesh data.
2. Model I/O creates a sphere with the specified size and returns an MDLMesh with all the vertex information in data buffers.
3. For Metal to be able to use the mesh, you convert it from a Model I/O mesh to a MetalKit mesh.
Queues, Buffers and Encoders
Each frame consists of commands that you send to the GPU. You wrap up these commands in a render command encoder. Command buffers organize these command encoders and a command queue organizes the command buffers.
➤ Add this code to create a command queue:

guard let commandQueue = device.makeCommandQueue() else {
  fatalError("Could not create a command queue")
}
You should set up the device and the command queue at the start of your app, and generally, you should use the same device and command queue throughout.
On each frame, you’ll create a command buffer and at least one render command encoder. These are lightweight objects that point to other objects, such as shader functions and pipeline states, that you set up only once at the start of the app.
Shader Functions
Shader functions are small programs that run on the GPU. You write these programs in the Metal Shading Language, which is a subset of C++. Normally, you'd create a separate file with a .metal extension specifically for shader functions, but for now, create a multi-line string containing the shader function code, and add it to your playground:

let shader = """
#include <metal_stdlib>
using namespace metal;

struct VertexIn {
  float4 position [[attribute(0)]];
};

vertex float4 vertex_main(const VertexIn vertex_in [[stage_in]]) {
  return vertex_in.position;
}

fragment float4 fragment_main() {
  return float4(1, 0, 0, 1);
}
"""
There are two shader functions in here: a vertex function named vertex_main and a fragment function named fragment_main. The vertex function is where you usually manipulate vertex positions, and the fragment function is where you specify the pixel color.
To set up a Metal library containing these two functions, add the following:

let library = try device.makeLibrary(source: shader, options: nil)
let vertexFunction = library.makeFunction(name: "vertex_main")
let fragmentFunction = library.makeFunction(name: "fragment_main")
The compiler will check that these functions exist and make them available to a pipeline descriptor.
The Pipeline State
In Metal, you set up a pipeline state for the GPU. By setting up this state, you're telling the GPU that nothing will change until the state changes. With the GPU in a fixed state, it can run more efficiently. The pipeline state contains all sorts of information that the GPU needs, such as which pixel format it should use and whether it should render with depth. The pipeline state also holds the vertex and fragment functions that you just created.
However, you don't create a pipeline state directly; rather, you create it through a descriptor. This descriptor holds everything the pipeline needs to know, and you only change the necessary properties for your particular rendering situation.
➤ Add this code:

let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.colorAttachments[0].pixelFormat = .bgra8Unorm
pipelineDescriptor.vertexFunction = vertexFunction
pipelineDescriptor.fragmentFunction = fragmentFunction
Here, you've specified the pixel format to be 32 bits with color pixel order of blue/green/red/alpha. You also set the two shader functions.
You'll describe to the GPU how the vertices are laid out in memory using a vertex descriptor. Model I/O automatically creates a vertex descriptor when it loads the sphere mesh, so you can just use that one.
➤ Add this code:

pipelineDescriptor.vertexDescriptor =
  MTKMetalVertexDescriptorFromModelIO(mesh.vertexDescriptor)
You've now set up the pipeline descriptor with the necessary information. MTLRenderPipelineDescriptor has many other properties, but for now, you'll use the defaults.
➤ Now, add this code:

let pipelineState = try device.makeRenderPipelineState(
  descriptor: pipelineDescriptor)
This code creates the pipeline state from the descriptor. Creating a pipeline state takes valuable processing time, so all of the above should be a one-time setup. In a real app, you might create several pipeline states to call different shading functions or use different vertex layouts.
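If you ever need a second pipeline state, say one that colors the sphere with a different fragment function, the pattern is the same. The following is only a hedged sketch: fragment_gray is a hypothetical function name that isn't defined in the shader string above, so you would have to write it yourself before this code could compile.

let secondDescriptor = MTLRenderPipelineDescriptor()
secondDescriptor.colorAttachments[0].pixelFormat = .bgra8Unorm
// Reuse the same vertex function and vertex layout as before.
secondDescriptor.vertexFunction = vertexFunction
secondDescriptor.vertexDescriptor =
  MTKMetalVertexDescriptorFromModelIO(mesh.vertexDescriptor)
// "fragment_gray" is an assumed name for a fragment function you would add.
secondDescriptor.fragmentFunction = library.makeFunction(name: "fragment_gray")
let secondPipelineState = try device.makeRenderPipelineState(
  descriptor: secondDescriptor)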
Rendering
From now on, the code should be performed every frame. MTKView has a delegate method that runs every frame, but since you're doing a simple render that fills out a static view, you don't need to keep refreshing the screen every frame.
When performing graphics rendering, the GPU's ultimate job is to output a single texture from a 3D scene. This texture is similar to the digital image created by a physical camera. The texture will be displayed on the device's screen each frame.
Render Passes
If you're trying to achieve a realistic render, you'll want to take into account shadows, lighting and reflections. Each of these takes a lot of calculation and is generally done in separate render passes. For example, a shadow render pass will render the entire scene of 3D models, but only retain grayscale shadow information.
A second render pass would render the models in full color. You can then combine the shadow and color textures to produce the final output texture that will go to the screen.
For the first part of this book, you'll use a single render pass. Later, you'll learn about multipass rendering. Conveniently, MTKView provides a render pass descriptor that will hold a texture called the drawable.
➤ Add this code to the end of the playground:

// 1
guard let commandBuffer = commandQueue.makeCommandBuffer(),
  // 2
  let renderPassDescriptor = view.currentRenderPassDescriptor,
  // 3
  let renderEncoder = commandBuffer.makeRenderCommandEncoder(
    descriptor: renderPassDescriptor)
else { fatalError() }
Here's what's happening:
1. You create a command buffer. This stores all the commands that you'll ask the GPU to run.
2. You obtain a reference to the view's render pass descriptor. The descriptor holds data for the render destinations, known as attachments. Each attachment needs information, such as a texture to store to, and whether to keep the texture throughout the render pass. The render pass descriptor is used to create the render command encoder.
3. From the command buffer, you get a render command encoder using the render pass descriptor. The render command encoder holds all the information necessary to send to the GPU so that it can draw the vertices.
If the system fails to create a Metal object, such as the command buffer or render encoder, that's a fatal error. The view's currentRenderPassDescriptor may not be available in a particular frame, and usually you'll just return from the rendering delegate method. Because you're asking for it only once in this playground, you get a fatal error.
➤ Add the following code:

renderEncoder.setRenderPipelineState(pipelineState)
This code gives the render encoder the pipeline state that you set up earlier.
The sphere mesh that you loaded earlier holds a buffer containing a simple list of vertices.
➤ Give this buffer to the render encoder by adding the following code:

renderEncoder.setVertexBuffer(
  mesh.vertexBuffers[0].buffer,
  offset: 0,
  index: 0)
The offset is the position in the buffer where the vertex information starts. The index is how the GPU vertex shader function locates this buffer.
Submeshes
The mesh is made up of submeshes. When artists create 3D models, they design them with different material groups. These translate to submeshes. For example, if you were rendering a car object, you might have a shiny car body and rubber tires. One material is shiny paint and another is rubber. On import, Model I/O creates two different submeshes that index to the correct vertices for that group. One vertex can be rendered multiple times by different submeshes. This sphere only has one submesh, so you'll use only one.
➤ Add this code:

guard let submesh = mesh.submeshes.first else {
  fatalError()
}
Now for the exciting part: drawing! You draw in Metal with a draw call.
➤ Add this code:

renderEncoder.drawIndexedPrimitives(
  type: .triangle,
  indexCount: submesh.indexCount,
  indexType: submesh.indexType,
  indexBuffer: submesh.indexBuffer.buffer,
  indexBufferOffset: 0)
Here, you're instructing the GPU to render a vertex buffer consisting of triangles with the vertices placed in the correct order by the submesh index information. This code does not do the actual render — that doesn't happen until the GPU has received all the command buffer's commands.
➤ To complete sending commands to the render command encoder and finalize the frame, add this code:

// 1
renderEncoder.endEncoding()
// 2
guard let drawable = view.currentDrawable else {
  fatalError()
}
// 3
commandBuffer.present(drawable)
commandBuffer.commit()
Going through the code:
1. You tell the render encoder that there are no more draw calls and end the render pass.
2. You get the drawable from the MTKView. The MTKView is backed by a Core Animation CAMetalLayer, and the layer owns a drawable texture which Metal can read and write to.
3. Ask the command buffer to present the MTKView's drawable and commit to the GPU.
➤ Finally, add this code to the end of the playground:

PlaygroundPage.current.liveView = view
With that line of code, you’ll be able to see the Metal view in the Assistant editor. ➤ Run the playground, and in the playground’s live view, you’ll see a red sphere on a cream background.
Note: Sometimes playgrounds don't compile or run when they should. If you're sure you've written the code correctly, then restart Xcode and reload the playground. Wait for a second or two before running.
Congratulations! You've written your first Metal app, and you've also used many of the Metal API commands that you'll use in every Metal app you write.
Challenge
Where you created the initial sphere mesh, experiment with setting the sphere to different sizes. For example, change the size from:

[0.75, 0.75, 0.75]
To:

[0.2, 0.75, 0.2]
Change the color of the sphere. In the shader function string, you'll see:

return float4(1, 0, 0, 1);
This code returns red=1, green=0, blue=0, alpha=1, which results in the red color. Try changing the numbers (from zero to 1) for a different color. Try this green, for example:

return float4(0, 0.4, 0.21, 1);
In the next chapter, you’ll examine 3D models up close in Blender. Then continuing in your Swift Playground, you’ll import and render a train model.
Key Points
• Rendering means to create an image from three-dimensional points.
• A frame is an image that the GPU renders sixty times a second (optimally).
• device is a software abstraction for the hardware GPU.
• A 3D model consists of a vertex mesh with shading materials grouped in submeshes.
• Create a command queue at the start of your app. This action organizes the command buffer and command encoders that you'll create every frame.
• Shader functions are programs that run on the GPU. You position vertices and color the pixels in these programs.
• The render pipeline state fixes the GPU into a particular state. It can set which shader functions the GPU should run and how vertex layouts are formatted.
Learning computer graphics is difficult. The Metal API is modern, and it takes a lot of pain out of the learning, but you need to know a lot of information up-front. Even if you feel overwhelmed at the moment, continue with the next chapters. Repetition will help with your understanding.
Chapter 2: 3D Models
Do you know what makes a good game even better? Gorgeous graphics! Creating amazing graphics — like those in Divinity: Original Sin 2, Diablo 3 and The Witcher 3 — generally requires a team of programmers and 3D artists who are fairly skilled at what they do. The graphics you see onscreen are created using 3D models that are rendered with custom renderers, similar to the one you wrote in the previous chapter, only more advanced. Nevertheless, the principle of rendering 3D models is still the same. In this chapter, you’ll learn all about 3D models, including how to render them onscreen, and how to work with them in Blender.
What Are 3D Models?
3D models are made up of vertices. Each vertex refers to a point in 3D space using x, y and z values.
A vertex in 3D space.
As you saw in the previous chapter, you send these vertex points to the GPU for rendering. You need three vertices to create a triangle, and GPUs are able to render triangles efficiently. To show smaller details, a 3D model may also use textures. You'll learn more about textures in Chapter 8, "Textures".
➤ Open the starter playground for this chapter.
This playground contains the train model in two formats (.obj and .usd), as well as two pages (Render and Export 3D Model and Import Train). If you don't see these items, you may need to hide/show the Project navigator using the icon at the top-left.
The Project Navigator
To show the file extensions, open Xcode Preferences, and on the General tab, choose File Extensions: Show All.
Show File Extensions
➤ From the Project navigator, select Render and Export 3D Model. This page contains the code from Chapter 1, "Hello, Metal!".
Examine the rendered sphere in the playground's live view. Notice how the sphere renders as a solid red shape and appears flat. To see the edges of each individual triangle, you can render the model in wireframe.
➤ To render in wireframe, add the following line of code just before the draw call:

renderEncoder.setTriangleFillMode(.lines)
This code tells the GPU to render lines instead of solid triangles.
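For reference, and only as an aside since the rest of this chapter keeps the wireframe, .fill is the default fill mode, so you could restore solid triangles with a one-line change:

// Switch back to solid rendering (the default).
renderEncoder.setTriangleFillMode(.fill)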
➤ Run the playground:
A sphere rendered in wireframe.
There's a bit of an optical illusion happening here. It may not look like it, but the GPU is rendering straight lines. The reason the sphere edges look curved is because of the number of triangles the GPU is rendering. If you render fewer triangles, curved models tend to look "blocky".
You can really see the 3D nature of the sphere now. The model's triangles are evenly spaced horizontally, but because you're viewing on a two-dimensional screen, they appear smaller at the edges of the sphere than the triangles in the middle.
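If you'd like to see the blocky effect for yourself (this is optional and not part of the chapter's steps), you could temporarily lower the segment count where the sphere mesh is created at the top of the page, so the GPU has far fewer triangles to draw:

// Fewer segments means fewer triangles and a visibly "blockier" outline.
let mdlMesh = MDLMesh(
  sphereWithExtent: [0.75, 0.75, 0.75],
  segments: [10, 10],
  inwardNormals: false,
  geometryType: .triangles,
  allocator: allocator)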
In 3D apps such as Blender or Maya, you generally manipulate points, lines and faces. Points are the vertices; lines, also called edges, are the lines between the vertices; and faces are the triangular flat areas.
Vertex, line and face.
The vertices are generally ordered into triangles because GPU hardware is specialized to process them. The GPU's core instructions are expecting to see a triangle. Of all possible shapes, why a triangle?
• A triangle has the least number of points of any polygon that can be drawn in two dimensions.
• No matter which way you move the points of a triangle, the three points will always be on the same plane.
• When you divide a triangle starting from any vertex, it always becomes two triangles.
When you're modeling in a 3D app, you generally work with quads (four-point polygons). Quads work well with subdivision or smoothing algorithms.
Creating Models With Blender
To create 3D models, you need a 3D modeling app. These apps range from free to hugely expensive. The best of the free apps — and the one used throughout this book — is Blender (v. 3.0). A lot of professionals use Blender, but if you're more familiar with another 3D app — such as Cheetah3D, Maya or Houdini — then you're welcome to use it since the concepts are the same.
➤ Download and install Blender from https://www.blender.org.
➤ Launch Blender. Click outside of the splash screen to close it, and you'll see an interface similar to this one:
The Blender Interface.
Your interface may look different. However, if you want your Blender interface to look like the image shown here, choose Edit ▸ Preferences…. Click the hamburger menu at the bottom left, choose Load Factory Preferences, and then click the popup Load Factory Preferences, which will appear under the cursor. Click Save Preferences to retain these preferences for future sessions.
Note: If you want to create your own models, the best place to start is with our Blender tutorial (https://bit.ly/3gwKiel). This tutorial teaches you how to make a mushroom. You can then render your mushroom in your playground at the end of this chapter.
A mushroom modeled in Blender.
3D File Formats
There are several standard 3D file formats. Here's an overview of what each one offers:
• .obj: This format, developed by Wavefront Technologies, has been around for a while, and almost every 3D app supports importing and exporting .obj files. You can specify materials (textures and surface properties) using an accompanying .mtl file; however, this format does not support animation or vertex colors.
• .glTF: Developed by Khronos — who oversee Vulkan and OpenGL — this format is relatively new and is still under active development. It has strong community support because of its flexibility. It supports animated models.
• .blend: This is the native Blender file format.
• .fbx: A proprietary format owned by Autodesk. This is a commonly used format that supports animation but is losing favor because it's proprietary and doesn't have a single standard.
• .usd: A scalable open source format introduced by Pixar. USD can reference many models and files, which is not ideal for sharing assets. .usdz is a USD archive file that contains everything needed for the model or scene. Apple uses the USDZ format for their AR models.
An .obj file contains only a single model, whereas .glTF and .usd files are containers for entire scenes, complete with models, animation, cameras and lights. In this book, you'll use Wavefront OBJ (.obj), USD, USDZ and Blender format (.blend).
Note: You can use Apple's Reality Converter to convert 3D files to USDZ. Apple also provides tools for validating and inspecting USDZ files (https://apple.co/3gykNcI), as well as a gallery of sample USDZ files (https://apple.co/3iJzMBW).
Exporting to Blender
Now that you have Blender all set up, it's time to export a model from your playground into Blender.
➤ Still in Render and Export 3D Model, near the top of the playground where you create the mesh, change:

let mdlMesh = MDLMesh(
  sphereWithExtent: [0.75, 0.75, 0.75],
  segments: [100, 100],
  inwardNormals: false,
  geometryType: .triangles,
  allocator: allocator)
To:

let mdlMesh = MDLMesh(
  coneWithExtent: [1, 1, 1],
  segments: [10, 10],
  inwardNormals: false,
  cap: true,
  geometryType: .triangles,
  allocator: allocator)
This code will generate a primitive cone mesh in place of the sphere. Run the playground, and you’ll see the wireframe cone.
A cone model.
This is the model you'll export using Model I/O.
➤ Open Finder, and in the Documents folder, create a new directory named Shared Playground Data. All of your saved files from the Playground will end up here, so make sure you name it correctly.
Note: The global constant playgroundSharedDataDirectory holds this folder name.
➤ To export the cone, add this code just after creating the mesh:

// begin export code
// 1
let asset = MDLAsset()
asset.add(mdlMesh)
// 2
let fileExtension = "obj"
guard MDLAsset.canExportFileExtension(fileExtension) else {
  fatalError("Can't export a .\(fileExtension) format")
}
// 3
do {
  let url = playgroundSharedDataDirectory
    .appendingPathComponent("primitive.\(fileExtension)")
  try asset.export(to: url)
} catch {
  fatalError("Error \(error.localizedDescription)")
}
// end export code
Let's have a closer look at the code:
1. The top level of a scene in Model I/O is an MDLAsset. You can build a complete scene hierarchy by adding child objects such as meshes, cameras and lights to the asset.
2. Check that Model I/O can export an .obj file type.
3. Export the cone to the directory stored in Shared Playground Data.
➤ Run the playground to export the cone object.
Note: If your playground crashes, it's probably because you haven't created the Shared Playground Data directory in Documents.
The .obj File Format
➤ In Finder, navigate to Documents ▸ Shared Playground Data. Here, you'll find the two exported files, primitive.obj and primitive.mtl.
➤ Using a plain text editor, open primitive.obj.
The following is an example .obj file. It describes a plane primitive with four corner vertices. The cone .obj file looks similar, except it has more data.

# Apple ModelIO OBJ File: plane
mtllib plane.mtl
g submesh
v 0 0.5 -0.5
v 0 -0.5 -0.5
v 0 -0.5 0.5
v 0 0.5 0.5
vn -1 0 0
vt 1 0
vt 0 0
vt 0 1
vt 1 1
usemtl material_1
f 1/1/1 2/2/1 3/3/1
f 1/1/1 3/3/1 4/4/1
s off
Here's the breakdown:
• mtllib: This is the name of the accompanying .mtl file. This file holds the material details and texture file names for the model.
• g: Starts a group of vertices.
• v: Vertex. For the cone, you'll have 102 of these.
• vn: Surface normal. This is a vector that points orthogonally — that's directly outwards. You'll read more about normals later.
• vt: uv coordinate that determines the vertex's position on a 2D texture. Textures use uv coordinates rather than xy coordinates.
• usemtl: The name of a material providing the surface information — such as color — for the following faces. This material is defined in the accompanying .mtl file.
• f: Defines faces. You can see here that the plane has two faces, and each face has three elements consisting of a vertex/texture/normal index. In this example, the last face listed: 4/4/1 would be the fourth vertex element / the fourth texture element / the first normal element: 0 0.5 0.5 / 1 1 / -1 0 0.
• s: Smoothing, currently off, means there are no groups that will form a smooth surface.
The .mtl File Format
The second file you exported contains the model's materials. Materials describe how the 3D renderer should color the vertex. For example, should the vertex be smooth and shiny? Pink? Reflective? The .mtl file contains values for these properties.
➤ Using a plain text editor, open primitive.mtl:

# Apple ModelI/O MTL File: primitive.mtl
newmtl material_1
Kd 1 1 1
Ka 0 0 0
Ks 0
ao 0
subsurface 0
metallic 0
specularTint 0
roughness 0.9
anisotropicRotation 0
sheen 0.05
sheenTint 0
clearCoat 0
clearCoatGloss 0
Here's the breakdown:
• newmtl material_1: This is the group that contains all of the cone's vertices.
• Kd: The diffuse color of the surface. In this case, 1 1 1 will color the object white.
• Ka: The ambient color. This models the ambient lighting in the room.
• Ks: The specular color. The specular color is the color reflected from a highlight.
You'll read more about these and the other material properties later.
Importing the Cone
It's time to import the cone into Blender.
➤ To start with a clean and empty Blender file:
1. Open Blender.
2. Choose File ▸ New ▸ General.
3. Left-click the cube that appears in the start-up file to select it.
4. Press X to delete the cube.
5. Left-click Delete in the menu under the cursor to confirm the deletion.
You now have a clear and ready-for-importing Blender file, so let's get to it.
➤ Choose File ▸ Import ▸ Wavefront (.obj), and select primitive.obj from the Documents ▸ Shared Playground Data directory.
The cone imports into Blender.
The cone in Blender.
➤ Left-click the cone to select it, and press Tab to put Blender into Edit Mode. Edit Mode allows you to see the vertices and triangles that make up the cone.
Edit mode
While you're in Edit Mode, you can move the vertices around and add new vertices to create any 3D model you can imagine.
Note: In the resources directory for this chapter, there’s a file with links to some excellent Blender tutorials. Using only a playground, you now have the ability to create, render and export a primitive. In the next part of this chapter, you’ll review and render a more complex model with separate material groups.
Material Groups
➤ In Blender, open train.blend, which you'll find in the resources directory for this chapter. This file is the original Blender file of the .obj train in your playground.
➤ Left-click the model to select it, and press Tab to go into Edit Mode.
The train in edit mode.
Unlike the cone, the train model has several material groups — one for each color. On the right-hand side of the Blender screen, you'll see the Properties panel, with the Material context already selected (that's the icon at the bottom of the vertical list of icons), and the list of materials within this model at the top.
➤ Select Body, and then click Select underneath the material list.
The vertices assigned to this material are now colored orange.
Material groups
Notice how the vertices are separated into different groups or materials. This separation makes it easier to select the various parts within Blender and also gives you the ability to assign different colors.
Note: When you first import this model into your playground, the renderer will render each of the material groups, but it may not pick up the correct colors. One way to verify a model's appearance is to view it in Blender.
➤ Go back to Xcode, and from the Project navigator, open the Import Train playground page. This playground renders — but does not export — the wireframe cone. In the playground's Resources folder, you'll see three files: train.mtl, train.obj and train.usd.
Note: Files in the Playground Resources folder are available to all playground pages. Files in each page's Resources folder are only available to that page.
➤ In Import Train, remove the line where you create the MDLMesh cone:

let mdlMesh = MDLMesh(
  coneWithExtent: [1, 1, 1],
  segments: [10, 10],
  inwardNormals: false,
  cap: true,
  geometryType: .triangles,
  allocator: allocator)
Don't worry about that compile error. You've still got some work to do.
➤ In place of the code you just removed, add this:

guard let assetURL = Bundle.main.url(
  forResource: "train",
  withExtension: "obj") else {
  fatalError()
}
This code sets up the file URL for the .obj format of the model. Later, you can try the .usd format, and you should get the same result.
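As a quick optional variation, and assuming nothing beyond what this page already includes, trying the USD version should only require a different extension, since train.usd is also in the Playground Resources folder:

// Load the USD version of the same model instead of the .obj.
guard let assetURL = Bundle.main.url(
  forResource: "train",
  withExtension: "usd") else {
  fatalError()
}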
Vertex Descriptors
Metal uses descriptors as a common pattern to create objects. You saw this pattern in the previous chapter when you set up a pipeline descriptor to describe a pipeline state. Before loading the model, you'll tell Metal how to lay out the vertices and other data by creating a vertex descriptor.
The following diagram describes an incoming buffer of model vertex data. It has two vertices with position, normal and texture coordinate attributes. The vertex descriptor informs Metal how you want to view this data.
The vertex descriptor
➤ Add this code below the code you just added:

// 1
let vertexDescriptor = MTLVertexDescriptor()
// 2
vertexDescriptor.attributes[0].format = .float3
// 3
vertexDescriptor.attributes[0].offset = 0
// 4
vertexDescriptor.attributes[0].bufferIndex = 0
Looking closer:
1. You create a vertex descriptor that you'll use to configure all of the properties that an object will need to know about.
Note: You can reuse this vertex descriptor with either the same values or reconfigured values to instantiate a different object.
2. The .obj file holds normal and texture coordinate data as well as vertex position data. For the moment, you don't need the surface normals or texture coordinates; you only need the position. You tell the descriptor that the xyz position data should load as a float3, which is a simd data type consisting of three Float values. An MTLVertexDescriptor has an array of 31 attributes where you can configure the data format — and in future chapters, you'll load the normal and texture coordinate attributes.
3. The offset specifies where in the buffer this particular data will start.
4. When you send your vertex data to the GPU via the render encoder, you send it in an MTLBuffer and identify the buffer by an index. There are 31 buffers available, and Metal keeps track of them in a buffer argument table. You use buffer 0 here so that the vertex shader function will be able to match the incoming vertex data in buffer 0 with this vertex layout.
➤ Now add this code below the previous lines:

// 1
vertexDescriptor.layouts[0].stride =
  MemoryLayout<SIMD3<Float>>.stride
// 2
let meshDescriptor =
  MTKModelIOVertexDescriptorFromMetal(vertexDescriptor)
// 3
(meshDescriptor.attributes[0] as! MDLVertexAttribute).name =
  MDLVertexAttributePosition
Going through everything:
1. Here, you specify the stride for buffer 0. The stride is the number of bytes between each set of vertex information. Referring back to the previous diagram, which described position, normal and texture coordinate information, the stride between each vertex would be float3 + float3 + float2. However, here you're only loading position data, so to get to the next position, you jump by a stride of float3. Using the buffer layout index and stride format, you can set up complex vertex descriptors referencing multiple MTLBuffers with different layouts. You have the option of interleaving position, normal and texture coordinates; or you can lay out a buffer containing all of the position data first, followed by other data.
Note: The SIMD3<Float> type is Swift's equivalent to float3. Later, you'll use a typealias for float3.
2. Model I/O needs a slightly different format vertex descriptor, so you create a new Model I/O descriptor from the Metal vertex descriptor. If you have a Model I/O descriptor and need a Metal one, MTKMetalVertexDescriptorFromModelIO() provides a solution.
3. Assign a string name "position" to the attribute. This tells Model I/O that this is positional data. The normal and texture coordinate data is also available, but with this vertex descriptor, you told Model I/O that you're not interested in those attributes.
➤ Continue by adding this code:

let asset = MDLAsset(
  url: assetURL,
  vertexDescriptor: meshDescriptor,
  bufferAllocator: allocator)
let mdlMesh =
  asset.childObjects(of: MDLMesh.self).first as! MDLMesh
This code reads the asset using the URL, vertex descriptor and memory allocator. You then read in the first Model I/O mesh buffer in the asset. Some more complex objects will have multiple meshes, but you’ll deal with that later.
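Before moving on, here's a hedged sketch, purely for reference and not part of this chapter's code, of how this vertex descriptor might grow when you also want surface normals. The attribute index, offsets and stride below are assumptions for an interleaved layout rather than values taken from a later chapter:

// Before converting to the Model I/O descriptor:
vertexDescriptor.attributes[1].format = .float3
vertexDescriptor.attributes[1].offset = MemoryLayout<SIMD3<Float>>.stride
vertexDescriptor.attributes[1].bufferIndex = 0
vertexDescriptor.layouts[0].stride = MemoryLayout<SIMD3<Float>>.stride * 2
// After converting with MTKModelIOVertexDescriptorFromMetal,
// name the second attribute so Model I/O loads normals into it:
(meshDescriptor.attributes[1] as! MDLVertexAttribute).name =
  MDLVertexAttributeNormal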
Now that you’ve loaded the model vertex information, the rest of the code will be the same, and your playground will load mesh from the new mdlMesh variable. ➤ Run the playground to see your train in wireframe.
Train wireframe wheels
Well, that's not good. The train is missing some wheels, and the ones that are there are way too high off the ground. Plus, the rest of the train is missing! Time to fix these problems, starting with the train's wheels.
Metal Coordinate System
All models have an origin. The origin is the location of the mesh. The train's origin is at [0, 0, 0]. In Blender, this places the train right at the center of the scene.
The origin
The Metal NDC (Normalized Device Coordinate) system is a 2-unit wide by 2-unit high by 1-unit deep box where X is right / left, Y is up / down and Z is in / out of the screen.
NDC (Normalized Device Coordinate) system
To normalize means to adjust to a standard scale. On a screen, you might address a location in screen coordinates of width 0 to 375, whereas the Metal normalized coordinate system doesn't care what the physical width of a screen is — its coordinates along the X axis are -1.0 to 1.0. In Chapter 6, "Coordinate Spaces", you'll learn about various coordinate systems and spaces.
Because the origin of the train is at [0,0,0], the train appears halfway up the screen, which is where [0,0,0] is in the Metal coordinate system.
➤ Select train.obj in the Project navigator.
The SceneKit editor opens and shows you the train model. Currently, the editor doesn't apply the materials, so the train appears white on a white background. You can still select the train by clicking somewhere around the center of the window. Do that now, and you'll see three arrows appear when you select the train. Now, open the Node inspector on the right, and change y Position to -1.
The Node inspector
Note: Typically, you'd change the position of the model in code. However, the purpose of this example is to illustrate how you can affect the model.
➤ Go back to Import Train, and run the playground. The wheels now appear at the bottom of the screen.
Wheels on the ground
Now that the wheels are fixed, you're ready to solve the case of the missing train!
Submeshes
So far, your primitive models included only one material group, and thus one submesh. Here's a plane with four vertices and two material groups.
Vertices on a plane
When Model I/O loads this plane, it places the four vertices in an MTLBuffer. The following image shows the vertex position data and also how two submesh buffers index into the vertex data.
Submesh buffers
The first submesh buffer holds the vertex indices of the light-colored triangle ACD. These indices point to vertices 0, 2 and 3. The second submesh buffer holds the indices of the dark triangle ADB. The submesh also has an offset where the submesh buffer starts. The index can be held in either a uint16 or a uint32. The offset of this second submesh buffer would be three times the size of the uint type.
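To make that offset concrete, and assuming for this example that the indices are stored as uint16 values (the diagram doesn't say which type it uses), the arithmetic would be:

// Three indices in the first submesh, times 2 bytes per uint16 index.
let secondSubmeshOffset = 3 * MemoryLayout<UInt16>.size  // 6 bytes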
Winding Order
The vertex order, also known as the winding order, is important here. The vertex order of this plane is counter-clockwise, as is the default .obj winding order. With a counter-clockwise winding order, triangles that are defined in counter-clockwise order are facing toward you, whereas triangles that are in clockwise order are facing away from you. In the next chapter, you'll go down the graphics pipeline, and you'll see that the GPU can cull triangles that are not facing toward you, saving valuable processing time.
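The playground doesn't set these explicitly, but as a hedged sketch of where winding order eventually matters, the render command encoder lets you declare which winding counts as front-facing and ask the GPU to cull back faces:

// Triangles wound counter-clockwise are treated as front-facing.
renderEncoder.setFrontFacing(.counterClockwise)
// Skip triangles that face away from the camera.
renderEncoder.setCullMode(.back)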
Render Submeshes
Currently, you're only rendering the first submesh, but because the train has several material groups, you'll need to loop through the submeshes to render them all.
➤ Toward the end of the playground, change:

guard let submesh = mesh.submeshes.first else {
  fatalError()
}
renderEncoder.drawIndexedPrimitives(
  type: .triangle,
  indexCount: submesh.indexCount,
  indexType: submesh.indexType,
  indexBuffer: submesh.indexBuffer.buffer,
  indexBufferOffset: 0)
To:

for submesh in mesh.submeshes {
  renderEncoder.drawIndexedPrimitives(
    type: .triangle,
    indexCount: submesh.indexCount,
    indexType: submesh.indexType,
    indexBuffer: submesh.indexBuffer.buffer,
    indexBufferOffset: submesh.indexBuffer.offset
  )
}
This code loops through the submeshes and issues a draw call for each one. The mesh and submeshes are in MTLBuffers, and the submesh holds the index listing of the vertices in the mesh.
➤ Run the playground, and your train renders completely — minus the material colors, which you’ll take care of in Chapter 11, “Maps & Materials”.
The final train
Congratulations! You're now rendering 3D models. For now, don't worry that you're only rendering them in two dimensions or that the colors aren't correct. After the next chapter, you'll know more about the internals of rendering. Following on from that, you'll learn how to move those vertices into the third dimension.
Challenge
If you're in for a fun challenge, complete the Blender tutorial to make a mushroom (https://bit.ly/3gwKiel), and then export what you make in Blender to an .obj file. If you want to skip the modeling, you'll find the mushroom.obj file in the resources directory for this chapter.
➤ Import mushroom.obj into the playground and render it.
If you use the mushroom from the resources directory, you'll first have to scale and reposition the mushroom in the SceneKit editor to view it correctly.
Wireframe mushroom
If you have difficulty, the completed playground is in the Projects ▸ Challenge directory for this chapter.
Key Points
• 3D models consist of vertices. Each vertex has a position in 3D space.
• In 3D modeling apps, you create models using quads, or polygons with four vertices. On import, Model I/O converts these quads to triangles.
• Triangles are the GPU's native format.
• Blender is a fully-featured professional free 3D modeling, animation and rendering app available from https://www.blender.org.
• There are many 3D file formats. Apple has standardized its AR models on Pixar's USD format in a compressed USDZ format.
• Vertex descriptors describe the buffer format for the model's vertices. You set the GPU pipeline state with the vertex descriptor, so that the GPU knows what the vertex buffer format is.
• A model is made up of at least one submesh. This submesh corresponds to a material group where you can define the color and other surface attributes of the group.
• Metal Normalized Device Coordinates are -1 to 1 on the X and Y axes, and 0 to 1 on the Z axis. X is left / right, Y is down / up and Z is front / back.
• The GPU will render only vertices positioned in Metal NDC.
Chapter 3: The Rendering Pipeline
Now that you know a bit more about 3D models and rendering, it’s time to take a drive through the rendering pipeline. In this chapter, you’ll create a Metal app that renders a red cube. As you work your way through this chapter, you’ll get a closer look at the hardware that’s responsible for turning your 3D objects into the gorgeous pixels you see onscreen. First up, the GPU and CPU.
The GPU and CPU
Every computer comes equipped with a Graphics Processing Unit (GPU) and Central Processing Unit (CPU). The GPU is a specialized hardware component that can process images, videos and massive amounts of data really fast. This operation is known as throughput and is measured by the amount of data processed in a specific unit of time.
The CPU, on the other hand, manages resources and is responsible for the computer's operations. Although the CPU can't process huge amounts of data like the GPU, it can process many sequential tasks (one after another) really fast. The time necessary to process a task is known as latency.
The ideal setup includes low latency and high throughput. Low latency allows for the serial execution of queued tasks, so the CPU can execute the commands without the system becoming slow or unresponsive — and high throughput lets the GPU render videos and games asynchronously without stalling the CPU. Because the GPU has a highly parallelized architecture specialized in doing the same task repeatedly and with little or no data transfers, it can process larger amounts of data.
The following diagram shows the major differences between the CPU and GPU.
Differences between CPU and GPU
The CPU has a large cache memory and a handful of Arithmetic Logic Unit (ALU) cores. In contrast, the GPU has a small cache memory and many ALU cores. The low latency cache memory on the CPU is used for fast access to temporary resources. The ALU cores on the GPU handle calculations without saving partial results to memory.
The CPU typically has only a few cores, while the GPU has hundreds — even thousands of cores. With more cores, the GPU can split the problem into many smaller parts, each running on a separate core in parallel, which helps to hide latency. At the end of processing, the partial results are combined, and the final result is returned to the CPU. But cores aren't the only thing that matters.
Besides being slimmed down, GPU cores also have special circuitry for processing geometry and are often called shader cores. These shader cores are responsible for the beautiful colors you see onscreen. The GPU writes an entire frame at a time to fit the full rendering window; it then proceeds to render the next frame as quickly as possible, so it can maintain a respectable frame rate.
CPU sending commands to GPU
The CPU continues to issue commands to the GPU, ensuring that the GPU always has work to do. However, at some point, either the CPU will finish sending commands or the GPU will finish processing them. To avoid stalling, Metal on the CPU queues up multiple commands in command buffers and will issue new commands, sequentially, for the next frame without waiting for the GPU to finish the previous frame. This means that no matter who finishes the work first, there will always be more work to do.
The GPU part of the graphics pipeline starts after the GPU receives all of the commands and resources. To get started with the rendering pipeline, you'll set up these commands and resources in a new project.
The Metal Project
So far, you've been using Playgrounds to learn about Metal. Playgrounds are great for testing and learning new concepts, but it's also important to understand how to set up a full Metal project using SwiftUI.
➤ In Xcode, create a new project using the Multiplatform App template.
➤ Name your project Pipeline, and fill out your team and organization identifier. Leave all of the checkbox options unchecked.
➤ Choose the location for your new project. Excellent, you now have a fancy, new SwiftUI app. ContentView.swift is the main view for the app; this is where you’ll call your Metal view. The MetalKit framework contains an MTKView, which is a special Metal rendering view. This is a UIView on iOS and an NSView on macOS. To interface with UIKit or Cocoa UI elements, you’ll use a Representable protocol that sits between SwiftUI and your MTKView. If you want to understand how this protocol works, you can find the information in our book, SwiftUI Apprentice. This configuration is all rather complicated, so in the resources folder for this chapter, you’ll find a pre-made MetalView.swift. ➤ Drag this file into your project, making sure that you check all of the checkboxes so that you copy the file and add it to both targets.
Adding files to targets
➤ Open MetalView.swift.
MetalView is a SwiftUI View structure that contains the MTKView property and hosts the Metal view.
➤ Open ContentView.swift, and change:

Text("Hello, world!")
  .padding()
To:

VStack {
  MetalView()
    .border(Color.black, width: 2)
  Text("Hello, Metal!")
}
.padding()
Here, you add MetalView to the view hierarchy and give it a border. ➤ Build and run your application using either the macOS target or the iOS target. You’ll see your hosted MTKView. The advantage of using SwiftUI is that it’s relatively easy to layer UI elements — such as the “Hello Metal” text here — underneath your Metal view.
Initial SwiftUI View
You now have a choice. You can subclass MTKView and replace the MTKView in MetalView with the subclassed one. In this case, the subclass's draw(_:) would get called every frame, and you'd put your drawing code in that method. However, in this book, you'll set up a Renderer class that conforms to MTKViewDelegate and sets Renderer as a delegate of MTKView. MTKView calls a delegate method every frame, and this is where you'll place the necessary drawing code.
Note: If you're coming from a different API world, you might be looking for a game loop construct. You do have the option of using CADisplayLink for timing, but Apple introduced MetalKit with its protocols to manage the game loop more easily.
The Renderer Class
➤ Create a new Swift file named Renderer.swift, and replace its contents with the following code:

import MetalKit

class Renderer: NSObject {
  init(metalView: MTKView) {
    super.init()
  }
}

extension Renderer: MTKViewDelegate {
  func mtkView(
    _ view: MTKView,
    drawableSizeWillChange size: CGSize
  ) {
  }

  func draw(in view: MTKView) {
    print("draw")
  }
}
Here, you create an initializer and make Renderer conform to MTKViewDelegate with the two MTKView delegate methods:
• mtkView(_:drawableSizeWillChange:): Called every time the size of the window changes. This allows you to update render texture sizes and camera projection.
• draw(in:): Called every frame. This is where you write your render code.
➤ Open MetalView.swift, and in MetalView, add a property to hold the renderer:

@State private var renderer: Renderer?
➤ Change body to:

var body: some View {
  MetalViewRepresentable(metalView: $metalView)
    .onAppear {
      renderer = Renderer(metalView: metalView)
    }
}
Here, you initialize the renderer when the metal view first appears.
Initialization
Just as you did in the first chapter, you need to set up the Metal environment. Metal has a major advantage over OpenGL in that you're able to instantiate some objects up-front rather than create them during each frame. The following diagram indicates some of the Metal objects you can create at the start of the app.
Create these outside the render loop
• MTLDevice: The software reference to the GPU hardware device.
• MTLCommandQueue: Responsible for creating and organizing MTLCommandBuffers every frame.
• MTLLibrary: Contains the source code from your vertex and fragment shader functions.
• MTLRenderPipelineState: Sets the information for the draw — such as which shader functions to use, what depth and color settings to use and how to read the vertex data.
• MTLBuffer: Holds data — such as vertex information — in a form that you can send to the GPU.
Typically, you'll have one MTLDevice, one MTLCommandQueue and one MTLLibrary object in your app. You'll also have several MTLRenderPipelineState objects that will define the various pipeline states, as well as several MTLBuffers to hold the data.
Before you can use these objects, however, you need to initialize them.
➤ Open Renderer.swift, and add these properties to Renderer:

static var device: MTLDevice!
static var commandQueue: MTLCommandQueue!
static var library: MTLLibrary!
var mesh: MTKMesh!
var vertexBuffer: MTLBuffer!
var pipelineState: MTLRenderPipelineState!
All of these properties are currently implicitly unwrapped optionals for convenience, but you can add error-checking later if you wish. You're using class properties for the device, the command queue and the library to ensure that only one of each exists. In rare cases, you may require more than one, but in most apps, one is enough.
➤ Still in Renderer.swift, add the following code to init(metalView:) before super.init():

guard
  let device = MTLCreateSystemDefaultDevice(),
  let commandQueue = device.makeCommandQueue() else {
    fatalError("GPU not available")
}
Renderer.device = device
Renderer.commandQueue = commandQueue
metalView.device = device
This code initializes the GPU and creates the command queue.

➤ Finally, after super.init(), add this:

metalView.clearColor = MTLClearColor(
  red: 1.0,
  green: 1.0,
  blue: 0.8,
  alpha: 1.0)
metalView.delegate = self
This code sets metalView.clearColor to a cream color. It also sets Renderer as the delegate for metalView so that the view will call the MTKViewDelegate drawing methods.

➤ Build and run the app to make sure everything's set up and working. If everything is good, you'll see the SwiftUI view as before, and in the debug console, you'll see the word "draw" repeatedly. Use this console statement to verify that your app is calling draw(in:) for every frame.

Note: You won't see metalView's cream color because you're not asking the GPU to do any drawing yet.
Create the Mesh

You've already created a sphere and a cone using Model I/O; now it's time to create a cube.

➤ In init(metalView:), before calling super.init(), add this:

// create the mesh
let allocator = MTKMeshBufferAllocator(device: device)
let size: Float = 0.8
let mdlMesh = MDLMesh(
  boxWithExtent: [size, size, size],
  segments: [1, 1, 1],
  inwardNormals: false,
  geometryType: .triangles,
  allocator: allocator)
do {
  mesh = try MTKMesh(mesh: mdlMesh, device: device)
} catch let error {
  print(error.localizedDescription)
}
This code creates the cube mesh, as you did in the previous chapter.

➤ Then, set up the MTLBuffer that contains the vertex data you'll send to the GPU.

vertexBuffer = mesh.vertexBuffers[0].buffer
This code puts the mesh data in an MTLBuffer. Next, you need to set up the pipeline state so that the GPU will know how to render the data.
Set Up the Metal Library

First, set up the MTLLibrary and ensure that the vertex and fragment shader functions are present.

➤ Continue adding code before super.init():

// create the shader function library
let library = device.makeDefaultLibrary()
Renderer.library = library
let vertexFunction = library?.makeFunction(name: "vertex_main")
let fragmentFunction = library?.makeFunction(name: "fragment_main")
Here, you set up the default library with some shader function pointers. You’ll create these shader functions later in this chapter. Unlike OpenGL shaders, these functions are compiled when you compile your project, which is more efficient than compiling your functions on the fly. The result is stored in the library.
Create the Pipeline State

To configure the GPU's state, you create a pipeline state object (PSO). This pipeline state can be a render pipeline state for rendering vertices, or a compute pipeline state for running a compute kernel.

➤ Continue adding code before super.init():

// create the pipeline state object
let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.vertexFunction = vertexFunction
pipelineDescriptor.fragmentFunction = fragmentFunction
pipelineDescriptor.colorAttachments[0].pixelFormat =
  metalView.colorPixelFormat
pipelineDescriptor.vertexDescriptor =
  MTKMetalVertexDescriptorFromModelIO(mdlMesh.vertexDescriptor)
do {
  pipelineState = try device.makeRenderPipelineState(
    descriptor: pipelineDescriptor)
} catch let error {
  fatalError(error.localizedDescription)
}
The PSO holds a potential state for the GPU. The GPU needs to know its complete state before it can start managing vertices. Here, you set the two shader functions the GPU will call and the pixel format for the texture to which the GPU will write. You also set the pipeline's vertex descriptor; this is how the GPU will know how to interpret the vertex data that you'll present in the mesh data MTLBuffer.

Note: If you need to use a different data buffer layout or call different vertex or fragment functions, you'll need additional pipeline states. Creating pipeline states is relatively time-consuming — which is why you do it up-front — but switching pipeline states during frames is fast and efficient.

The initialization is complete, and your project compiles. Next up, you'll start on drawing your model.
Render Frames

MTKView calls draw(in:) for every frame; this is where you'll set up your GPU render commands.

➤ In draw(in:), replace the print statement with this:

guard
  let commandBuffer = Renderer.commandQueue.makeCommandBuffer(),
  let descriptor = view.currentRenderPassDescriptor,
  let renderEncoder = commandBuffer.makeRenderCommandEncoder(
    descriptor: descriptor)
else { return }
You’ll send a series of commands to the GPU contained in command encoders. In one frame, you might have multiple command encoders, and the command buffer manages these. You create a render command encoder using a render pass descriptor. This contains the render target textures that the GPU will draw into. In a complex app, you may well have multiple render passes in one frame, with multiple target textures. You’ll learn how to chain render passes together later. ➤ Continue adding this code: // drawing code goes here // 1 renderEncoder.endEncoding() // 2 guard let drawable = view.currentDrawable else { return } commandBuffer.present(drawable) // 3 commandBuffer.commit()
Here’s a closer look at the code: 1. After adding the GPU commands to a command encoder, you end its encoding. 2. You present the view’s drawable texture to the GPU. 3. When you commit the command buffer, you send the encoded commands to the GPU for execution.
Drawing It’s time to set up the list of commands that the GPU will need to draw your frame. In other words, you’ll: • Set the pipeline state to configure the GPU hardware. • Give the GPU the vertex data. • Issue a draw call using the mesh’s submesh groups. ➤ Still in draw(in:), replace the comment: // drawing code goes here
With:

renderEncoder.setRenderPipelineState(pipelineState)
renderEncoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)
for submesh in mesh.submeshes {
  renderEncoder.drawIndexedPrimitives(
    type: .triangle,
    indexCount: submesh.indexCount,
    indexType: submesh.indexType,
    indexBuffer: submesh.indexBuffer.buffer,
    indexBufferOffset: submesh.indexBuffer.offset)
}
Great, you set up the GPU commands to set the pipeline state and the vertex buffer, and to perform the draw calls on the mesh’s submeshes. When you commit the command buffer at the end of draw(in:), you’re telling the GPU that the data and pipeline are ready, and it’s time for the GPU to take over.
The Render Pipeline

Are you ready to investigate the GPU pipeline? Great, let's get to it! In the following diagram, you can see the stages of the pipeline.
The render pipeline

The graphics pipeline takes the vertices through multiple stages, during which the vertices have their coordinates transformed between various spaces.

Note: This chapter describes immediate-mode rendering (IMR) architecture. Apple's chips for iOS since the A11, and Apple silicon for macOS, use tile-based rendering (TBR). New Metal features are able to take advantage of TBR. However, for simplicity, you'll start off with a basic understanding of general GPU architecture. If you want a preview of some differences, watch Apple's WWDC 2020 video Bring your Metal app to Apple silicon Macs (https://developer.apple.com/videos/play/wwdc2020/10631/).

As a Metal programmer, you're only concerned with the Vertex and Fragment Processing stages since they're the only two programmable stages. Later in the chapter, you'll write both a vertex shader and a fragment shader. For all the non-programmable pipeline stages, such as Vertex Fetch, Primitive Assembly and Rasterization, the GPU has specially designed hardware units to serve those stages.
1 - Vertex Fetch

The name of this stage varies among different graphics Application Programming Interfaces (APIs). For example, DirectX calls it Input Assembler.

To start rendering 3D content, you first need a scene. A scene consists of models that have meshes of vertices. One of the simplest models is the cube, which has six faces (12 triangles). As you saw in the previous chapter, you use a vertex descriptor to define the way vertices are read in along with their attributes — such as position, texture coordinates, normal and color. You do have the option not to use a vertex descriptor and just send an array of vertices in an MTLBuffer; however, if you decide not to use one, you'll need to know how the vertex buffer is organized ahead of time.

When the GPU fetches the vertex buffer, the MTLRenderCommandEncoder draw call tells the GPU whether the buffer is indexed. If the buffer is not indexed, the GPU assumes the buffer is an array, and it reads in one element at a time, in order. In the previous chapter, you saw how Model I/O imports .obj files and sets up their buffers indexed by submesh. This indexing is important because vertices are cached for reuse. For example, a cube has 12 triangles and eight vertices (at the corners). If you don't index, you'll have to specify the vertices for each triangle and send 36 vertices to the GPU. This may not sound like a lot, but in a model that has several thousand vertices, vertex caching is important. There is also a second cache for shaded vertices so that vertices that are accessed multiple times are only shaded once. A shaded vertex is one to which color was already applied. But that happens in the next stage.

A special hardware unit known as the Scheduler sends the vertices and their attributes on to the Vertex Processing stage.
2 - Vertex Processing

In the Vertex Processing stage, vertices are processed individually. You write code to calculate per-vertex lighting and color. More importantly, you send vertex coordinates through various coordinate spaces to reach their position in the final framebuffer.
You briefly learned about shader functions and about the Metal Shading Language (MSL) in Chapter 1, “Hello, Metal!”. Now it’s time to see what happens under the hood at the hardware level. Look at this diagram of the architecture of an AMD GPU:
AMD GPU architecture

Going top-down, the GPU has:

• 1 Graphics Command Processor: This coordinates the work processes.
• 4 Shader Engines (SE): An SE is an organizational unit on the GPU that can serve an entire pipeline. Each SE has a geometry processor, a rasterizer and Compute Units.
• 9 Compute Units (CU) per SE: A CU is nothing more than a group of shader cores.
• 64 shader cores per CU: A shader core is the basic building block of the GPU where all of the shading work is done.

In total, the 36 CUs have 2,304 shader cores. Compare that to the number of cores in your 8-core CPU.

For mobile devices, the story is a little different. For comparison, look at the following image showing a GPU similar to those in recent iOS devices. Instead of having SEs and CUs, the PowerVR GPU has Unified Shading Clusters (USC).
This particular GPU model has 6 USCs and 32 cores per USC for a total of only 192 cores.
The PowerVR GPU

Note: The iPhone X had the first mobile GPU entirely designed in-house by Apple. As it turns out, Apple has not made the GPU hardware specifications public.

So what can you do with that many cores? Since these cores are specialized in both vertex and fragment shading, one obvious thing to do is give all the cores work to do in parallel so that the processing of vertices or fragments is done faster. There are a few rules, though. Inside a CU, you can only process either vertices or fragments at one time. (Good thing there are thirty-six of those!) Another rule is that you can only process one shader function per SE. Having four SEs lets you combine work in interesting and useful ways. For example, you can run one fragment shader on one SE and a second fragment shader on a second SE at one time. Or you can separate your vertex shader from your fragment shader and have them run in parallel but on different SEs.
Creating a Vertex Shader

It's time to see vertex processing in action. The vertex shader you're about to write is minimal, but it encapsulates most of the necessary vertex shader syntax you'll need in this and subsequent chapters.

➤ Create a new file using the Metal File template, and name it Shaders.metal. Then, add this code at the end of the file:

// 1
struct VertexIn {
  float4 position [[attribute(0)]];
};

// 2
vertex float4 vertex_main(const VertexIn vertexIn [[stage_in]]) {
  return vertexIn.position;
}
Going through the code:

1. Create a struct VertexIn to describe the vertex attributes that match the vertex descriptor you set up earlier. In this case, just position.
2. Implement a vertex shader, vertex_main, that takes in VertexIn structs and returns vertex positions as float4 types.

Remember that vertices are indexed in the vertex buffer. The vertex shader gets the current vertex index via the [[stage_in]] attribute and unpacks the VertexIn structure cached for the vertex at the current index.

Compute Units can process (at one time) batches of vertices up to their maximum number of shader cores. This batch can fit entirely in the CU cache and vertices can thus be reused as needed. The batch will keep the CU busy until the processing is done, but other CUs should become available to process the next batch.
As soon as the vertex processing is done, the cache is cleared for the next batches of vertices. At this point, vertices are now ordered and grouped, ready to be sent to the primitive assembly stage.
Vertex processing

To recap, the CPU sent the GPU a vertex buffer that you created from the model's mesh. You configured the vertex buffer using a vertex descriptor that tells the GPU how the vertex data is structured. On the GPU, you created a structure to encapsulate the vertex attributes. The vertex shader takes in this structure as a function argument, and through the [[stage_in]] qualifier, acknowledges that position comes from the CPU via the [[attribute(0)]] position in the vertex buffer. The vertex shader then processes all of the vertices and returns their positions as a float4.

Note: When you use a vertex descriptor with attributes, you don't have to match types. The MTLBuffer position is a float3, whereas VertexIn defines the position as a float4.

A special hardware unit known as the Distributer sends the grouped blocks of vertices on to the Primitive Assembly stage.
3 - Primitive Assembly

The previous stage sent processed vertices grouped into blocks of data to this stage. The important thing to keep in mind is that vertices belonging to the same geometrical shape (primitive) are always in the same block. That means that the one vertex of a point, or the two vertices of a line, or the three vertices of a triangle, will always be in the same block, hence a second block fetch isn't necessary.
Primitive assembly

Along with vertices, the CPU also sends vertex connectivity information when it issues the draw call command, like this:

renderEncoder.drawIndexedPrimitives(
  type: .triangle,
  indexCount: submesh.indexCount,
  indexType: submesh.indexType,
  indexBuffer: submesh.indexBuffer.buffer,
  indexBufferOffset: 0)
The first argument of the draw function contains the most important information about vertex connectivity. In this case, it tells the GPU that it should draw triangles from the vertex buffer it sent.
The Metal API provides five primitive types:
The primitive types

• point: For each vertex, rasterize a point. You can specify the size of a point that has the attribute [[point_size]] in the vertex shader.
• line: For each pair of vertices, rasterize a line between them. If a vertex was already included in a line, it cannot be included again in other lines. The last vertex is ignored if there are an odd number of vertices.
• lineStrip: Same as a simple line, except that the line strip connects all adjacent vertices and forms a poly-line. Each vertex (except the first) is connected to the previous vertex.
• triangle: For every sequence of three vertices, rasterize a triangle. The last vertices are ignored if they cannot form another triangle.
• triangleStrip: Same as a simple triangle, except adjacent vertices can be connected to other triangles as well.

There is one more primitive type known as a patch, but this needs special treatment. You'll read more about patches in Chapter 19, "Tessellation & Terrains".

As you read in the previous chapter, the pipeline specifies the winding order of the vertices. If the winding order is counter-clockwise, and the triangle vertex order is counter-clockwise, the vertices are front-faced; otherwise, the vertices are back-faced and can be culled since you can't see their color and lighting. Primitives that are entirely outside the visible region are culled, while primitives that are only partially off-screen are clipped.
Clipping primitives

For efficiency, you should specify the winding order and enable back-face culling on the render command encoder. At this point, primitives are fully assembled from connected vertices and are ready to move on to the rasterizer.
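Picking up the culling point above, here's a minimal sketch of how you might set both options in draw(in:). The winding order you choose must match how your model's vertices are wound:

// Treat counter-clockwise triangles as front-facing,
// and skip rasterizing triangles that face away from the camera.
renderEncoder.setFrontFacing(.counterClockwise)
renderEncoder.setCullMode(.back)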
4 - Rasterization

There are two modern rendering techniques currently evolving on separate paths but sometimes used together: ray tracing and rasterization. They are quite different, and both have pros and cons. Ray tracing — which you'll read more about in Chapter 27, "Rendering With Rays" — is preferred when rendering content that is static and far away, while rasterization is preferred when the content is closer to the camera and more dynamic.

With ray tracing, the renderer sends a ray into the scene for each pixel on the screen to see if there's an intersection with an object. If there is, the pixel takes that object's color, but only if the object is closer to the screen than the previously saved object for the current pixel. Rasterization works the other way around: for each object in the scene, it sends rays back toward the screen and checks which pixels the object covers. Depth information is kept the same way as for ray tracing, so the pixel color is updated if the current object is closer than the previously saved one.

At this point, all connected vertices sent from the previous stage need to be represented on a two-dimensional grid using their X and Y coordinates. This step is known as triangle setup. Here is where the rasterizer needs to calculate the slope or steepness of the line segments between any two vertices. When the three slopes for the three vertices are known, the triangle can be formed from these three edges.
Next, a process known as scan conversion runs on each line of the screen to look for intersections and to determine what’s visible and what’s not. To draw on the screen at this point, you need only the vertices and the slopes they determine. The scan algorithm determines if all the points on a line segment or all the points inside of a triangle are visible, in which case the triangle is filled with color entirely.
Rasterizing triangles

For mobile devices, the rasterization takes advantage of the tiled architecture of PowerVR GPUs by rasterizing the primitives on a 32×32 tile grid in parallel. In this case, 32 is the number of screen pixels assigned to a tile, but this size perfectly fits the number of cores in a USC.

What if one object is behind another object? How can the rasterizer determine which object to render? This hidden surface removal problem can be solved by using stored depth information (early-Z testing) to determine whether each point is in front of other points in the scene.

After rasterization is finished, three more specialized hardware units take the stage:

• A buffer known as Hierarchical-Z is responsible for removing fragments that were marked for culling by the rasterizer.
• The Z and Stencil Test unit then removes non-visible fragments by comparing them against the depth and stencil buffer.
• Finally, the Interpolator unit takes the remaining visible fragments and generates fragment attributes from the assembled triangle attributes.

At this point, the Scheduler unit, again, dispatches work to the shader cores, but this time it's the rasterized fragments sent for Fragment Processing.
5 - Fragment Processing

Time for a quick review of the pipeline.
Fragment Processing

• The Vertex Fetch unit grabs vertices from the memory and passes them to the Scheduler unit.
• The Scheduler unit knows which shader cores are available, so it dispatches work on them.
• After the work is done, the Distributer unit knows if this work was Vertex or Fragment Processing. If the work was Vertex Processing, it sends the result to the Primitive Assembly unit. This path continues to the Rasterization unit, and then back to the Scheduler unit. If the work was Fragment Processing, it sends the result to the Color Writing unit.
• Finally, the colored pixels are sent back to the memory.

The primitive processing in the previous stages is sequential because there's only one Primitive Assembly unit and one Rasterization unit. However, as soon as fragments reach the Scheduler unit, work can be forked (divided) into many tiny parts, and each part is given to an available shader core.
Hundreds or even thousands of cores are now doing parallel processing. When the work is complete, the results will be joined (merged) and sent to the memory, again sequentially. The fragment processing stage is another programmable stage. You create a fragment shader function that will receive the lighting, texture coordinate, depth and color information that the vertex function outputs. The fragment shader output is a single color for that fragment. Each of these fragments will contribute to the color of the final pixel in the framebuffer. All of the attributes are interpolated for each fragment.
Fragment interpolation

For example, to render this triangle, the vertex function would process three vertices with the colors red, green and blue. As the diagram shows, each fragment that makes up this triangle is interpolated from these three colors.

Linear interpolation simply averages the color at each point on the line between two endpoints. If one endpoint has red color, and the other has green color, the midpoint on the line between them will be yellow. And so on. The interpolation equation is parametric and has this form, where parameter p is the percentage (or a range from 0 to 1) of a color's presence:

newColor = p * oldColor1 + (1 - p) * oldColor2
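As a quick check of this equation using Swift's simd types: halfway between red and green, p is 0.5, and the result is a half-intensity yellow.

import simd

let p: Float = 0.5
let red = SIMD4<Float>(1, 0, 0, 1)
let green = SIMD4<Float>(0, 1, 0, 1)
let midpoint = p * red + (1 - p) * green
// midpoint is (0.5, 0.5, 0.0, 1.0)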
Color is easy to visualize, but the other vertex function outputs are also similarly interpolated for each fragment. Note: If you don’t want a vertex output to be interpolated, add the attribute [[flat]] to its definition.
Creating a Fragment Shader

➤ In Shaders.metal, add the fragment function to the end of the file:

fragment float4 fragment_main() {
  return float4(1, 0, 0, 1);
}
This is the simplest fragment function possible. You return the color red in the form of a float4. All the fragments that make up the cube will be red.

The GPU takes the fragments and does a series of post-processing tests:

• alpha-testing determines which opaque objects are drawn (and which are not) based on depth testing.
• In the case of translucent objects, alpha-blending combines the color of the new object with the color already saved in the color buffer.
• scissor testing checks whether a fragment is inside of a specified rectangle; this test is useful for masked rendering.
• stencil testing compares the stencil value stored in the framebuffer for the fragment against a value you choose.
• In the previous stage, early-Z testing ran; now late-Z testing is done to solve more visibility issues. Stencil and depth tests are also useful for ambient occlusion and shadows.
• Finally, antialiasing is also calculated here so that the final images that get to the screen do not look jagged.

You'll learn more about post-processing tests in Chapter 20, "Fragment Post-Processing".
6 - Framebuffer

As soon as fragments have been processed into pixels, the Distributer unit sends them to the Color Writing unit. This unit is responsible for writing the final color in a special memory location known as the framebuffer. From here, the view gets its colored pixels refreshed every frame. But does that mean the color is written to the framebuffer while being displayed on the screen?
A technique known as double-buffering is used to solve this situation. While the first buffer is being displayed on the screen, the second one is updated in the background. Then, the two buffers are swapped, and the second one is displayed on the screen while the first one is updated, and the cycle continues. Whew! That was a lot of hardware information to take in. However, the code you’ve written is what every Metal renderer uses, and despite just starting out, you should begin to recognize the rendering process when you look at Apple’s sample code. ➤ Build and run the app, and you’ll see a beautifully rendered red cube:
A rendered cube

Notice how the cube is not square. Remember that Metal uses Normalized Device Coordinates (NDC), which range from -1 to 1 on the X axis. Resize your window, and the cube will maintain a size relative to the size of the window. In Chapter 6, "Coordinate Spaces", you'll learn how to position objects precisely on the screen.

What an incredible journey you've had through the rendering pipeline. In the next chapter, you'll explore vertex and fragment shaders in greater detail.
Challenge

Using the train.usd model in the resources folder for this project, replace the cube with this train. When importing the model, be sure to select Create Groups and remember to add the model to both targets. Instead of changing the model's vertical position in the SceneKit editor, change it in the vertex function using this code:

float4 position = vertexIn.position;
position.y -= 1.0;
Finally, color your train blue.
Challenge result

Refer to the previous chapter for asset loading and the vertex descriptor code if you need help. The finished code for this challenge is in the project challenge directory for this chapter.
Key Points

• CPUs are best at processing sequential tasks fast, whereas GPUs excel at processing many small tasks in parallel.
• SwiftUI is a great host for MTKViews, as you can layer UI elements easily.
• Move Metal setup tasks to the initialization phase where you can. Initialize the device, command queues, pipeline states and model data buffers once at the start of your app.
• Each frame, create a command buffer and one or more command encoders.
• GPU architecture allows for a strict pipeline. Configure this using PSOs (pipeline state objects).
• There are two programmable stages in a simple rendering GPU pipeline. You calculate vertex positions using the vertex shader, and calculate the color that appears on the screen using the fragment shader.
Chapter 4: The Vertex Function
So far, you’ve worked your way through 3D models and the graphics pipeline. Now, it’s time to look at the first of two programmable stages in Metal, the vertex stage — and more specifically, the vertex function.
Shader Functions

There are three types of shader functions:

• Vertex function: Calculates the position of a vertex.
• Fragment function: Calculates the color of a fragment.
• Kernel function: Used for general-purpose parallel computations, such as image processing.

In this chapter, you'll focus only on the vertex function. In Chapter 7, "The Fragment Function", you'll explore how to control the color of each fragment. And in Chapter 16, "GPU Compute Programming", you'll discover how to use parallel programming with multiple threads to write to buffers and textures.

By now, you should be familiar with vertex descriptors, and how to use them to describe how to lay out the vertex attributes from your loaded 3D model. To recap:

• MDLVertexDescriptor: You use a Model I/O vertex descriptor to read in the .obj file. Model I/O creates buffers with the desired layout of attributes, such as position, normals and texture coordinates.
• MTLVertexDescriptor: You use a Metal vertex descriptor when creating the pipeline state. The GPU vertex function uses the [[stage_in]] attribute to match the incoming data with the vertex descriptor in the pipeline state.

As you work through this chapter, you'll construct your own vertex mesh and send vertices to the GPU without using a vertex descriptor. You'll learn how to manipulate these vertices in the vertex function, and then you'll upgrade to using a vertex descriptor. In the process, you'll see how using Model I/O to import your meshes does a lot of the heavy lifting for you.
The Starter Project

➤ Open the starter project. This SwiftUI project contains a reduced Renderer so that you can add your own mesh, and the shader functions are bare-bones so that you can build them up. You're not doing any drawing yet, so there's nothing to see when you run the app.
Rendering a Quad

You create a quad using two triangles — and each triangle has three vertices, for a total of six vertices.
A quad mesh

➤ Create a new Swift file named Quad.swift.

➤ Replace the existing code with:

import MetalKit

struct Quad {
  var vertices: [Float] = [
    -1,  1,  0,  // triangle 1
     1, -1,  0,
    -1, -1,  0,
    -1,  1,  0,  // triangle 2
     1,  1,  0,
     1, -1,  0
  ]
}
As you know, a vertex is made of an x, y and z value. Each group of three Floats in vertices describes one vertex. Here, the winding order of the points is clockwise, which is important.
➤ Add a new vertex buffer property to Quad and initialize it:

let vertexBuffer: MTLBuffer

init(device: MTLDevice, scale: Float = 1) {
  vertices = vertices.map { $0 * scale }
  guard let vertexBuffer = device.makeBuffer(
    bytes: &vertices,
    length: MemoryLayout<Float>.stride * vertices.count,
    options: []) else {
      fatalError("Unable to create quad vertex buffer")
  }
  self.vertexBuffer = vertexBuffer
}
With this code, you initialize the Metal buffer with the array of vertices. You multiply each vertex by scale, which lets you set the size of the quad during initialization.

➤ Open Renderer.swift, and add a new property for the quad mesh:

lazy var quad: Quad = {
  Quad(device: Renderer.device, scale: 0.8)
}()
Here, you initialize quad with Renderer's device — and because you initialize device in init(metalView:), you must initialize quad lazily. You also resize the quad so that you can see it properly. (If you were to leave the scale at the default of 1.0, the quad would cover the entire screen. Covering the screen is useful for full-screen drawing since you can only draw fragments where you're rendering geometry.)

➤ In draw(in:), after // do drawing here, add:

renderEncoder.setVertexBuffer(
  quad.vertexBuffer,
  offset: 0,
  index: 0)
You create a command on the render command encoder to set the vertex buffer in the buffer argument table at index 0.

➤ Add the draw call:

renderEncoder.drawPrimitives(
  type: .triangle,
  vertexStart: 0,
  vertexCount: quad.vertices.count)
Here, you draw the quad's six vertices.

➤ Open Shaders.metal.

➤ Replace the vertex function with:

vertex float4 vertex_main(
  constant float3 *vertices [[buffer(0)]],
  uint vertexID [[vertex_id]])
{
  float4 position = float4(vertices[vertexID], 1);
  return position;
}
There’s an error with this code, which you’ll observe and fix shortly. The GPU performs the vertex function for each vertex. In the draw call, you specified that there are six vertices. So, the vertex function will perform six times. When you pass a pointer into the vertex function, you must specify an address space, either constant or device. constant is optimized for accessing the same variable over several vertex functions in parallel. device is best for accessing different parts of a buffer over the parallel functions — such as when using a buffer with points and color data interleaved. [[vertex_id]] is an attribute qualifier that gives you the current vertex. You can use this as an entry into the array held in vertices.
You might notice that you’re sending the GPU a buffer that you filled with an array of Floats. In the vertex function, you read the same buffer as an array of float3s, leading to an error in the display. ➤ Build and run.
A rendering error
Although you might get a different render, the vertices are in the wrong position because a float3 type takes up more memory than three Float types. The SIMD float3 type is padded and takes up the same memory as the float4 type, which is 16 bytes. Changing this parameter to a packed_float3 will fix the error since a packed_float3 takes up 12 bytes.

Note: You can check the sizes of types in the Metal Shading Language Specification at https://apple.co/2UT993x.

➤ In the vertex function, change float3 in the first parameter to packed_float3.

➤ Build and run.
The rendering error corrected

The quad now displays correctly. Alternatively, you could have defined the Float array vertices as an array of simd_float3. In that case, you'd use float3 in the vertex function, as both types take 16 bytes. However, sending 16 bytes per vertex is slightly less efficient than sending 12 bytes per vertex.
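You can verify these sizes on the Swift side with MemoryLayout. A triple of plain Floats matches MSL's packed_float3, while SIMD3<Float> is padded out to the same 16 bytes as a float4:

import simd

let packedSize = MemoryLayout<Float>.stride * 3    // 12 bytes, like packed_float3
let simdSize = MemoryLayout<SIMD3<Float>>.stride   // 16 bytes, padded like float4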
Calculating Positions

Metal is all about gorgeous graphics and fast, smooth animation. As a next step, you'll make your quad move up and down the screen. To do this, you'll have a timer that updates every frame, and the position of each vertex will depend on this timer.
The vertex function is where you update vertex positions, so you'll send the timer data to the GPU.

➤ Open Renderer.swift, and add a new property to Renderer:

var timer: Float = 0
➤ In draw(in:), right before:

renderEncoder.setRenderPipelineState(pipelineState)
➤ Add the following code:

// 1
timer += 0.005
var currentTime = sin(timer)
// 2
renderEncoder.setVertexBytes(
  &currentTime,
  length: MemoryLayout<Float>.stride,
  index: 11)
Let’s have a closer look: 1. For every frame, you update the timer. You want your cube to move up and down the screen, so you’ll use a value between -1 and 1. Using sin() is a great way to achieve this balance as sine values are always -1 to 1. You can change the speed of your animation by changing the value that you add to this timer for each frame. 2. If you’re only sending a small amount of data — say less than 4KB — to the GPU, setVertexBytes(_:length:index:) is an alternative to setting up an MTLBuffer. Here, you set currentTime to index 11 in the buffer argument table. Keeping buffers 1 through 10 for vertex attributes — such as vertex positions — helps you to remember which buffers hold what data. ➤ Open Shaders.metal, and replace the vertex function: vertex float4 vertex_main( constant packed_float3 *vertices [[buffer(0)]], constant float &timer [[buffer(11)]], uint vertexID [[vertex_id]]) { float4 position = float4(vertices[vertexID], 1); position.y += timer; return position; }
You receive the single value timer as a float in buffer 11. You add the timer value to the y position and return the new position from the function. In the next chapter, you’ll start learning how to project vertices into 3D space using matrix multiplication. But, you don’t always need matrix multiplication to move vertices; here, you can achieve the translation of the position in y using simple addition. ➤ Build and run the app, and you’ll see a lovely animated quad.
An animated quad
More Efficient Rendering

Currently, you're using six vertices to render two triangles.
The mesh of two triangles
Of those vertices, 0 and 3 are in the same position, as are 1 and 5. If you render a mesh with thousands — or even millions — of vertices, it's important to reduce duplication as much as possible. You can do this with indexed rendering. Create a structure of only unique positions, and then use indices to get the right position for a vertex.

➤ Open Quad.swift, and rename vertices to oldVertices.

➤ Add the following structures to Quad:

var vertices: [Float] = [
  -1,  1,  0,
   1,  1,  0,
  -1, -1,  0,
   1, -1,  0
]

var indices: [UInt16] = [
  0, 3, 2,
  0, 1, 3
]

vertices now holds the unique four points of the quad in any order. indices holds the index of each vertex in the correct vertex order. Refer to oldVertices to make sure your indices are correct.

➤ Add a new Metal buffer to hold indices:

let indexBuffer: MTLBuffer
➤ At the end of init(device:scale:), add:

guard let indexBuffer = device.makeBuffer(
  bytes: &indices,
  length: MemoryLayout<UInt16>.stride * indices.count,
  options: []) else {
    fatalError("Unable to create quad index buffer")
}
self.indexBuffer = indexBuffer
You create the index buffer the same way you did the vertex buffer.

➤ Open Renderer.swift, and in draw(in:) before the draw call, add:

renderEncoder.setVertexBuffer(
  quad.indexBuffer,
  offset: 0,
  index: 1)
Here, you send the index buffer to the GPU.

➤ Change the draw call to:

renderEncoder.drawPrimitives(
  type: .triangle,
  vertexStart: 0,
  vertexCount: quad.indices.count)
Use the index count for the number of vertices to render, not the vertex count.

➤ Open Shaders.metal, and change the vertex function to:

vertex float4 vertex_main(
  constant packed_float3 *vertices [[buffer(0)]],
  constant ushort *indices [[buffer(1)]],
  constant float &timer [[buffer(11)]],
  uint vertexID [[vertex_id]])
{
  ushort index = indices[vertexID];
  float4 position = float4(vertices[index], 1);
  return position;
}
Here, vertexID is the index into the buffer holding the quad indices. You use the value in the indices buffer to index the correct vertex in the vertex buffer. ➤ Build and run. Sure, your quad is positioned the same way as before, but now you’re sending less data to the GPU.
Indexed mesh

From the number of entries in arrays, it might appear as if you're actually sending more data — but you're not! The memory footprint of oldVertices is 72 bytes, whereas the footprint of vertices + indices is 60 bytes.
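Here's the arithmetic behind those numbers: each Float takes 4 bytes and each UInt16 index takes 2 bytes.

• oldVertices: 6 vertices × 3 Floats × 4 bytes = 72 bytes
• vertices: 4 vertices × 3 Floats × 4 bytes = 48 bytes
• indices: 6 indices × 2 bytes = 12 bytes, for an indexed total of 48 + 12 = 60 bytes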
Vertex Descriptors

A more efficient draw call is available when you use indices for rendering vertices. However, you first need to set up a vertex descriptor in the pipeline. It's always a good idea to use vertex descriptors, as most often, you won't only send positions to the GPU. You'll also send attributes such as normals, texture coordinates and colors. When you can lay out your own vertex data, you have more control over how your engine handles model meshes.

➤ Create a new Swift file named VertexDescriptor.swift.

➤ Replace the code with:

import MetalKit

extension MTLVertexDescriptor {
  static var defaultLayout: MTLVertexDescriptor {
    let vertexDescriptor = MTLVertexDescriptor()
    vertexDescriptor.attributes[0].format = .float3
    vertexDescriptor.attributes[0].offset = 0
    vertexDescriptor.attributes[0].bufferIndex = 0

    let stride = MemoryLayout<Float>.stride * 3
    vertexDescriptor.layouts[0].stride = stride
    return vertexDescriptor
  }
}
Here, you set up a vertex layout that has only one attribute. This attribute describes the position of each vertex. A vertex descriptor holds arrays of attributes and buffer layouts.

• attributes: For each attribute, you specify the type format and offset in bytes of the first item from the beginning of the buffer. You also specify the index of the buffer that holds the attribute.
• buffer layout: You specify the length of the stride of all attributes combined in each buffer.

It may be confusing here as you're using index 0 to index into both layouts and attributes, but the layouts index 0 corresponds to the bufferIndex 0 used by attributes.
Note: stride describes how many bytes are between each instance. Due to internal padding and byte alignment, this value can be different from size. For an excellent explanation of size, stride and alignment, check out Greg Heo's article at https://bit.ly/2V3gBJl.

To the GPU, the vertexBuffer now looks like this:
Vertex buffer layout

➤ Open Renderer.swift, and locate where you create the pipeline state in init(metalView:).
The GPU will now expect vertices in the format described by this vertex descriptor. ➤ In draw(in:), remove: renderEncoder.setVertexBuffer( quad.indexBuffer, offset: 0, index: 1)
You’ll include the index buffer in the draw call.
➤ Change the draw call to:

renderEncoder.drawIndexedPrimitives(
  type: .triangle,
  indexCount: quad.indices.count,
  indexType: .uint16,
  indexBuffer: quad.indexBuffer,
  indexBufferOffset: 0)
This draw call expects the index buffer to use UInt16, which is how you described your indices array in Quad. You don’t explicitly send quad.indexBuffer to the GPU because this draw call will do it for you. ➤ Open Shaders.metal. ➤ Replace the vertex function with: vertex float4 vertex_main( float4 position [[attribute(0)]] [[stage_in]], constant float &timer [[buffer(11)]]) { return position; }
You did all the heavy lifting for the layout on the Swift side, so that takes the size of the vertex function way down. :] You describe each per-vertex input with the [[stage_in]] attribute. The GPU now looks at the pipeline state’s vertex descriptor. [[attribute(0)]] is the attribute in the vertex descriptor that describes the position. Even though you defined your original vertex data as three Floats, you can define the position as float4 here. The GPU can make the conversion.
It’s worth noting that when the GPU adds the w information to the xyz position, it adds 1.0. As you’ll see in the following chapters, this w value is quite important during rasterization. The GPU now has all of the information it needs to calculate the position for each vertex.
➤ Build and run the app to ensure that everything still works. The resulting render will be the same as before.
Rendering using a vertex descriptor
Adding Another Vertex Attribute

You probably won't ever have just one attribute, so let's add a color attribute for each vertex. You have a choice whether to use two buffers or interleave the color between each vertex position. If you choose to interleave, you'll set up a structure to hold position and color. In this example, however, it's easier to add a new colors buffer to match each vertex.

➤ Open Quad.swift, and add the new array:

var colors: [simd_float3] = [
  [1, 0, 0],  // red
  [0, 1, 0],  // green
  [0, 0, 1],  // blue
  [1, 1, 0]   // yellow
]
You now have four RGB colors to match the four vertices.

➤ Create a new buffer property:

let colorBuffer: MTLBuffer
➤ At the end of init(device:scale:), add:

guard let colorBuffer = device.makeBuffer(
  bytes: &colors,
  length: MemoryLayout<simd_float3>.stride * colors.count,
  options: []) else {
    fatalError("Unable to create quad color buffer")
}
self.colorBuffer = colorBuffer
You initialize colorBuffer the same way as the previous two buffers.

➤ Open Renderer.swift, and in draw(in:) right before the draw call, add:

renderEncoder.setVertexBuffer(
  quad.colorBuffer,
  offset: 0,
  index: 1)
You send the color buffer to the GPU using buffer index 1, which must match the index in the vertex descriptor layout.

➤ Open VertexDescriptor.swift, and add the following code to defaultLayout before return:

vertexDescriptor.attributes[1].format = .float3
vertexDescriptor.attributes[1].offset = 0
vertexDescriptor.attributes[1].bufferIndex = 1
vertexDescriptor.layouts[1].stride =
  MemoryLayout<simd_float3>.stride
Here, you describe the layout of the color buffer in buffer index 1.

➤ Open Shaders.metal.

➤ You can only use [[stage_in]] on one parameter, so create a new structure:

struct VertexIn {
  float4 position [[attribute(0)]];
  float4 color [[attribute(1)]];
};
➤ Change the vertex function to:

vertex float4 vertex_main(
  VertexIn in [[stage_in]],
  constant float &timer [[buffer(11)]])
{
  return in.position;
}
This code is still short and concise. The GPU knows how to retrieve position and color from the buffers because of the [[attribute(n)]] qualifier in the structure, which looks at the pipeline state’s vertex descriptor. ➤ Build and run to ensure your blue quad still renders.
Quad with two attributes

The fragment function determines the color of each rendered fragment. You need to pass the vertex's color to the fragment function. You'll learn more about the fragment function in Chapter 7, "The Fragment Function".

➤ Still in Shaders.metal, add this structure:

struct VertexOut {
  float4 position [[position]];
  float4 color;
};
Instead of returning just the position from the vertex function, you'll now return both position and color. You specify a position attribute to let the GPU know which property in this structure is the position.

➤ Replace the vertex function with:

vertex VertexOut vertex_main(
  VertexIn in [[stage_in]],
  constant float &timer [[buffer(11)]])
{
  VertexOut out {
    .position = in.position,
    .color = in.color
  };
  return out;
}
You now return a VertexOut instead of a float4.
➤ Change the fragment function to:

fragment float4 fragment_main(VertexOut in [[stage_in]]) {
  return in.color;
}
The [[stage_in]] attribute indicates that the GPU should take the VertexOut output from the vertex function and match it with the rasterized fragments. Here, you return the vertex color. Remember from Chapter 3, “The Rendering Pipeline”, that each fragment’s input gets interpolated. ➤ Build and run the app, and you’ll see the quad rendered with beautiful colors.
Interpolated vertex colors
Rendering Points

Instead of rendering triangles, you can render points and lines.

➤ Open Renderer.swift, and in draw(in:), change:

renderEncoder.drawIndexedPrimitives(
  type: .triangle,
➤ To:

renderEncoder.drawIndexedPrimitives(
  type: .point,
If you build and run now, the GPU will render the points, but it doesn't know what point size to use, so it flickers over various point sizes. To fix this problem, you'll also return a point size when returning data from the vertex function.
➤ Open Shaders.metal, and add this property to VertexOut:

float pointSize [[point_size]];
The [[point_size]] attribute will tell the GPU what point size to use.

➤ Replace the initialization of out with:

VertexOut out {
  .position = in.position,
  .color = in.color,
  .pointSize = 30
};
Here, you assign the point size of 30. ➤ Build and run to see your points rendered with their vertex color:
Rendering points
Challenge

So far, you've sent vertex positions to the GPU in an array buffer. But this isn't entirely necessary. All the GPU needs to know is how many vertices to draw. Your challenge is to remove the vertex and index buffers, and draw 50 points in a circle. Here's an overview of the steps you'll need to take, along with some code to get you started:

1. In Renderer, remove the vertex descriptor from the pipeline.
2. Replace the draw call in Renderer so that it doesn't use indices but does draw 50 vertices.
3. In draw(in:), remove all of the setVertexBuffer commands.
4. The GPU will need to know the total number of points, so send this value the same way you did timer, in buffer 0.
5. Replace the vertex function with:

vertex VertexOut vertex_main(
  constant uint &count [[buffer(0)]],
  constant float &timer [[buffer(11)]],
  uint vertexID [[vertex_id]])
{
  float radius = 0.8;
  float pi = 3.14159;
  float current = float(vertexID) / float(count);
  float2 position;
  position.x = radius * cos(2 * pi * current);
  position.y = radius * sin(2 * pi * current);
  VertexOut out {
    .position = float4(position, 0, 1),
    .color = float4(1, 0, 0, 1),
    .pointSize = 20
  };
  return out;
}
Remember, this is an exercise to help you understand how to position points on the GPU without holding any equivalent data on the Swift side. So, don’t worry too much about the math. You can use the sine and cosine of the current vertex ID to plot the point around a circle.
Notice that there’s no built-in value for pi on the GPU. You’ll see your 50 points plotted into a circle.
Points in a circle

Try animating the points by adding timer to current. If you have any difficulties, you can find the solution in the project challenge directory for this chapter.
Key Points

• The vertex function's fundamental task is positioning vertices. When you render a model, you send the GPU the model's vertices in its original position. The vertex shader will then reposition those vertices to the correct spot in your 3D world.
• Shader code uses attributes such as [[buffer(0)]] and [[position]] extensively. To find out more about these attributes, refer to the Metal Shading Language specification document (https://apple.co/3hPTbjQ).
• You can pass any data in an MTLBuffer to the GPU using setVertexBuffer(_:offset:index:). If the data is less than 4KB, you don't have to set up a buffer; you can, instead, pass a structure using setVertexBytes(_:length:index:).
• When possible, use indexed rendering. With indexed rendering, you pass less data to the GPU — and memory bandwidth is a major bottleneck.
• When possible, use vertex descriptors. With vertex descriptors, the GPU knows the format of the data being passed, and you'll get fewer errors in your code when you change a type on the Swift side and forget to change the shader function.
Chapter 5: 3D Transformations
In the previous chapter, you translated vertices and moved objects around the screen by calculating the position data in the vertex function. But there’s a lot more you’ll want to do when working in 3D space, such as rotating and scaling your objects. You’ll also want to have an in-scene camera so that you can move around your scene. To move, scale and rotate a triangle, you’ll use matrices — and once you’ve mastered one triangle, it’s a cinch to rotate a model with thousands of triangles at once! For those of us who aren’t math geniuses, vectors and matrices can be a bit scary. Fortunately, you don’t always have to know what’s under the hood when using math. To help, this chapter focuses not on the math, but the matrices. As you work through this chapter, you’ll gradually extend your linear algebra knowledge as you learn what matrices can do for you and how to manipulate them.
Transformations

Look at the following picture.
Affine Transformations

Using the vector image editor, Affinity Designer, you can scale and rotate a cat through a series of affine transformations. Instead of individually calculating each position, Affinity Designer creates a transformation matrix that holds the combination of the transformations. It then applies the transformation to each element.

Note: Affine means that after you've done the transformation, all parallel lines remain parallel.

Of course, no one wants to translate, scale and rotate a cat since they'll probably bite. So instead, you'll translate, scale and rotate a triangle.
The Starter Project & Setup

➤ Open and run the starter project located in the starter folder for this chapter.
The starter project

This project renders a Triangle twice rather than a Quad. In Renderer, you'll see two draw calls (one for each triangle). Renderer passes position to the vertex function and color to the fragment function; it does this for each triangle. The gray triangle is at its original position, and the red triangle has transformations.

➤ Before moving on to the next step, make sure you understand the code in Renderer's draw(in:) and the vertex function in Shaders.metal.

ContentView.swift is now located in the group SwiftUI Views, and it displays a grid over the metal view so that you can visualize your vertex positions more easily.
Setting Up the Preview Using SwiftUI

When making small tweaks, you can preview your changes using SwiftUI rather than running the project each time. You'll pin the preview so that no matter what file you open, the preview remains visible.

Note: If you have a small screen, you may prefer to run the project each time to test it.

➤ Open ContentView.swift.

➤ At the top-right of the Xcode window, click Adjust Editor Options and choose Canvas from the menu. You can change the layout from this menu so that the preview canvas is on the right.
Showing the preview canvas

➤ Build the project using the shortcut Command-B, and press Option-Command-P to resume the preview. (If the resume preview shortcut doesn't work, try Editor Menu ▸ Canvas ▸ Refresh Canvas. The shortcut should now work. You may also have to select your team in the project's target.)
Note: Whenever you make changes to your code, you'll need to build your app using Command-B. You then need to press Option-Command-P to resume/refresh the preview as the preview will not automatically build the Metal shaders.
The preview

➤ At the bottom-left of the preview window, you'll see an option to pin the preview.
Pinning the preview

With the preview pinned, you can now move to other files without the preview disappearing.
Translation

The starter project renders two triangles:

• A gray triangle without any transformations.
• A red triangle translated with position = simd_float3(0.3, -0.4, 0).
Displacement Vectors

In the first challenge in the previous chapter, you calculated the position of each vertex in the shader function. A more common computer graphics paradigm is to set the position of each vertex of the model in the vertex buffer, and then send a matrix to the vertex shader that contains the model's current position, rotation and scale.
Vectors & Matrices

You can better describe position as a displacement vector of [0.3, -0.4, 0]. You move each vertex 0.3 units in the x-direction, and -0.4 in the y-direction from its starting position.
In the following image, the blue arrows are vectors.
Vectors

The left blue arrow is a vector with a value of [-1, 2]. The right blue arrow — the one near the cat — is also a vector with a value of [-1, 2]. Positions (points) are locations in space, whereas vectors are displacements in space. In other words, a vector contains the amount and direction to move. If you were to displace the cat by the blue vector, it would end up at point (2, 4). That's the cat's position (3, 2) plus the vector [-1, 2]. This 2D vector is a 1×2 matrix. It has one column and two rows.

Note: You can order matrices by rows or columns. Metal matrices are constructed in column-major order, which means that columns are contiguous in memory.
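Here's the cat example above as a quick sketch in Swift's simd types:

import simd

let position = SIMD2<Float>(3, 2)       // the cat's position
let displacement = SIMD2<Float>(-1, 2)  // the blue vector
let newPosition = position + displacement
// newPosition is (2.0, 4.0)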
A matrix is a two-dimensional array. Even the single number 1 is a 1×1 matrix. In fact, the number 1 is unique in that when you multiply a number by 1, the answer is always that number. For every size of square matrix — where the array width is the same as the array height — there's a matrix with this same property. It's called the identity matrix. Any vector or matrix multiplied by an identity matrix returns the same value. A 4×4 identity matrix looks like this (all zeros, except for the diagonal 1s):

An identity matrix

A 3D transformation matrix has four rows and four columns. A transformation matrix holds scaling and rotation information in the upper-left 3×3 matrix, with the translation information in the last column. When you multiply vectors and matrices, the number of columns of the left-side matrix or vector must equal the number of rows of the right side. For example, you can't multiply a float3 by a float4×4.
The Magic of Matrices

When you multiply matrices, you combine them into one matrix. You can then multiply a vector by this matrix to transform the vector. For example, you can set up a rotation matrix and a translation matrix. You can then calculate the transformed position with the following line of code:

translationMatrix * rotationMatrix * positionVector

Matrix multiplication goes from right to left. Here, the rotation is applied before the translation. This is a fundamental concept of linear algebra — and if you want to continue with computer graphics, you'll need to understand linear algebra more fully. For now, understanding the concepts of setting up a transformation matrix can take you a long way.
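For instance, using the simd types this book relies on, combining two matrices and applying the result might look like this. It's a minimal sketch with illustrative values, not code from the starter project:

import simd

// Hypothetical matrices, chosen only to show the order of operations.
let translationMatrix = float4x4(
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, 0],
  [0.3, -0.4, 0, 1])  // translation lives in the last column
let rotationMatrix = matrix_identity_float4x4  // no rotation yet

// Combine once, then transform any number of vertices.
let transformMatrix = translationMatrix * rotationMatrix
let positionVector = simd_float4(1, 0, 0, 1)
let transformed = transformMatrix * positionVector  // (1.3, -0.4, 0, 1)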
Creating a Matrix

➤ Open Renderer.swift, and locate where you render the first gray triangle in draw(in:).

➤ Change the position code from:

var position = simd_float3(0, 0, 0)
renderEncoder.setVertexBytes(
  &position,
  length: MemoryLayout<simd_float3>.stride,
  index: 11)

➤ To:

var translation = matrix_float4x4()
translation.columns.0 = [1, 0, 0, 0]
translation.columns.1 = [0, 1, 0, 0]
translation.columns.2 = [0, 0, 1, 0]
translation.columns.3 = [0, 0, 0, 1]
var matrix = translation
renderEncoder.setVertexBytes(
  &matrix,
  length: MemoryLayout<matrix_float4x4>.stride,
  index: 11)

Here, you create an identity matrix and encode a render command to send it to the GPU.

➤ Locate the position code for the second red triangle and change:

position = simd_float3(0.3, -0.4, 0)
renderEncoder.setVertexBytes(
  &position,
  length: MemoryLayout<simd_float3>.stride,
  index: 11)

➤ To:

let position = simd_float3(0.3, -0.4, 0)
translation.columns.3.x = position.x
translation.columns.3.y = position.y
translation.columns.3.z = position.z
matrix = translation
renderEncoder.setVertexBytes(
  &matrix,
  length: MemoryLayout<matrix_float4x4>.stride,
  index: 11)
You'll use this matrix to translate the position in the vertex shader.

➤ Open Shaders.metal, and change:

constant float3 &position [[buffer(11)]])

➤ To:

constant float4x4 &matrix [[buffer(11)]])

You receive the matrix into the shader.

➤ In the vertex function, change:

float3 translation = in.position.xyz + position;

➤ To:

float3 translation = in.position.xyz + matrix.columns[3].xyz;
You use the fourth column of the matrix as the displacement vector. ➤ Build and resume the preview. So far, the output is the same.
Translation by adding a matrix column to the position
Remember that this matrix is also going to hold rotation and scaling information, so to calculate the position, instead of adding the translation displacement vector, you'll do matrix multiplication.

➤ Change the contents of the vertex function to:

float4 translation = matrix * in.position;
VertexOut out {
  .position = translation
};
return out;
➤ Build and resume the preview, and you’ll see there’s still no change. You can now add scaling and rotation to the matrix in Renderer without having to change the shader function each time.
Scaling

➤ Open Renderer.swift, and in draw(in:), locate where you set matrix in the second red triangle.

➤ Before matrix = translation, add this:

let scaleX: Float = 1.2
let scaleY: Float = 0.5
let scaleMatrix = float4x4(
  [scaleX, 0, 0, 0],
  [0, scaleY, 0, 0],
  [0, 0, 1, 0],
  [0, 0, 0, 1])

Without going into the mathematics too much, you can use this code to set up a scale matrix.

➤ Change matrix = translation to:

matrix = scaleMatrix

You replace the translation matrix with the scale matrix.
➤ Build and preview the app.
Scaling with a matrix

In the vertex function, the matrix multiplies each vertex of the triangle by the x and y scales.

➤ Change matrix = scaleMatrix to:

matrix = translation * scaleMatrix
This code translates the scaled triangle. ➤ Build and preview the app.
A translated and scaled triangle
Rotation

You perform rotation in a similar way to scaling.

➤ Change matrix = translation * scaleMatrix to this:

let angle = Float.pi / 2.0
let rotationMatrix = float4x4(
  [cos(angle), -sin(angle), 0, 0],
  [sin(angle), cos(angle), 0, 0],
  [0, 0, 1, 0],
  [0, 0, 0, 1])
matrix = rotationMatrix
Here, you set a rotation around the z-axis of the angle in radians.

Note: Float.pi / 2.0 is the same as 90º, which is 1.5708 radians. A radian is the standard unit in computer graphics. This is the formula to convert degrees to radians: degrees * pi / 180 = radians.

➤ Build and preview, and you'll see how each vertex of the red triangle is rotated by 90º around the origin [0, 0, 0].
Rotating about the origin
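The degrees-to-radians formula from the note above maps directly onto a tiny helper. This is only a sketch; the book's math library provides its own version of this conversion later:

extension Float {
  // degrees * pi / 180 = radians
  var degreesToRadians: Float {
    self * .pi / 180
  }
}

let rightAngle = Float(90).degreesToRadians  // 1.5708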
➤ Replace matrix = rotationMatrix with: matrix = translation * rotationMatrix * scaleMatrix
This code first scales each vertex, then rotates, then translates. ➤ Build and preview.
Scale, rotate and translate

The order of matrix operations is important. Experiment with changing the order to see what happens.

Scaling and rotation take place at the origin point (coordinates [0, 0, 0]). There may be times, however, that you want the rotation to take place around a different point. For example, let's rotate the triangle around the right-most point of the triangle when it's in its identity position (i.e., the same position and rotation as the gray triangle). To rotate the triangle, you'll set up a translation matrix with the vector between the origin and the right-most point, taking the following steps:

1. Translate all the vertices to the origin using the translation matrix's inverse.
2. Rotate.
3. Translate back again.
➤ Before setting matrix in the red triangle, add this code:

translation.columns.3.x = triangle.vertices[6]
translation.columns.3.y = triangle.vertices[7]
translation.columns.3.z = triangle.vertices[8]

You set the translation matrix to move to the third vertex of the triangle, which is the right-most point. Remember the steps. Step 1 is to translate all of the vertices by the distance from the origin. You can achieve this by setting a matrix to the vertex's vector value and using the translation matrix's inverse. Don't forget to build and preview after each of the following steps so that you can see what the matrix multiplication is doing.

➤ Change matrix = translation * rotationMatrix * scaleMatrix to:

matrix = translation.inverse
This code places the right-most vertex at the origin, translating all other vertices by the same amount.
Rotate about a point (1)

➤ Change the code you just entered to:

matrix = rotationMatrix * translation.inverse
The triangle rotates by 90º around the origin.
Rotate about a point (2)

➤ Change the code you just entered to:

matrix = translation * rotationMatrix * translation.inverse
Fantastic! You’re doing all of the steps of translating each vertex by the distance of the right-most vertex from the origin. After that, you’re rotating each vertex and translating it back again, causing the triangle to rotate around its right-most point.
Rotate about a point (3)
Key Points

• A vector is a matrix with only one row or column.
• By combining three matrices for translation, rotation and scale, you can position a model anywhere in the scene.
• In the resources folder for this chapter, references.markdown suggests further reading to help better understand transformations with linear algebra.
Chapter 6: Coordinate Spaces
To easily find a point on a grid, you need a coordinate system. For example, if the grid happens to be your iPhone 13 screen, the center point might be x: 195, y: 422. However, that point may be different depending on what space it's in.

In the previous chapter, you learned about matrices. By multiplying a vertex's position by a particular matrix, you can convert the vertex position to a different coordinate space. There are typically six spaces a vertex travels through as it makes its way along the pipeline:

• Object
• World
• Camera
• Clip
• NDC (Normalized Device Coordinate)
• Screen

Since this is starting to read like a description of Voyager leaving our solar system, let's have a quick conceptual look at each coordinate space before attempting the conversions.
Object Space

If you're familiar with the Cartesian coordinate system, you know that it uses a pair of coordinates to map an object's location. The following image shows a 2D grid with the possible vertices of the dog mapped using Cartesian coordinates.

Vertices in object space

The positions of the vertices are in relation to the dog's origin, which is located at (0, 0). The vertices in this image are located in object space (or local or model space). In the previous chapter, Triangle held an array of vertices in object space, describing the position of each point of the triangle.

World Space

In the following image, the direction arrows mark the world's origin at (0, 0, 0). So, in world space, the dog is at (1, 0, 1) and the cat is at (-1, 0, -2).
Vertices in world space
Of course, we all know that cats always see themselves at the center of the universe, so, naturally, the cat is located at (0, 0, 0) in cat space. This cat space location makes the dog's position, (2, 0, 3), relative to the cat. When the cat moves around in his cat space, he remains at (0, 0, 0), while the position of everything else changes relative to the cat.

Note: Cat space is not recognized as a traditional 3D coordinate space, but mathematically, you can create your own space and use any position in the universe as the origin. Every other point in the universe is now relative to that origin. In a later chapter, you'll discover other spaces besides the ones described here.

Camera Space

Enough about the cat. Let's move on to the dog. For him, the center of the universe is the person holding the camera. So, in camera space (or view space), the camera is at (0, 0, 0) and the dog is approximately at (-3, -2, 7). When the camera moves, it stays at (0, 0, 0), but the positions of the dog and cat move relative to the camera.

Clip Space

The main reason for doing all this math is to project with perspective. In other words, you want to take a three-dimensional scene into a two-dimensional space. Clip space is a distorted cube that's ready for flattening.
Clip space
In this scene, the dog and the cat are the same size, but the dog appears smaller because of its location in 3D space. Here, the dog is farther away than the cat, so he looks smaller.

Note: You could use orthographic or isometric projection instead of perspective projection, which comes in handy if you're rendering engineering drawings.

NDC (Normalized Device Coordinate) Space

Projection into clip space creates a half cube of w size. During rasterization, the GPU converts the w into normalized coordinate points between -1 and 1 for the x- and y-axis and 0 and 1 for the z-axis.

Screen Space

Now that the GPU has a normalized cube, it will flatten clip space into two dimensions and convert everything into screen coordinates, ready to display on the device's screen. In this final image, the flattened scene has perspective — with the dog being smaller than the cat, indicating the distance between the two.
Final render
Converting Between Spaces

To convert from one space to another, you can use transformation matrices. In the following image, the vertex on the dog's ear is (-1, 4, 0) in object space. But in world space, the origin is different, so the vertex — judging from the image — is at about (0.75, 1.5, 1).

Converting object to world

To change the dog vertex positions from object space to world space, you can translate (move) them using a transformation matrix. Since you're converting between four spaces, you have access to three corresponding matrices:

The three transformation matrices

• Model matrix: between object and world space
• View matrix: between world and camera space
• Projection matrix: between camera and clip space

Coordinate Systems

Different graphics APIs use different coordinate systems. You already found out that Metal's NDC (Normalized Device Coordinates) uses 0 to 1 on the z-axis. You may also already be familiar with OpenGL, which uses -1 to 1 on the z-axis.
In addition to being different sizes, OpenGL’s z-axis points in the opposite direction from Metal’s z-axis. That’s because OpenGL’s system is a right-handed coordinate system, and Metal’s system is a left-handed coordinate system. Both systems use x to the right and y as up. Blender uses a different coordinate system, where z is up, and y is into the screen.
Coordinate systems

If you're consistent with your coordinate system and create matrices accordingly, it doesn't matter what coordinate system you use. In this book, we're using Metal's left-handed coordinate system, but we could have used a right-handed coordinate system with different matrix creation methods instead.

The Starter Project

With a better understanding of coordinate systems and spaces, you're ready to start creating matrices.

➤ In Xcode, open the starter project for this chapter and either set up the SwiftUI preview (as described in the previous chapter) or build and run the app. Note that the width constraint is removed, so you may have to zoom out the preview.
The project is similar to the playground you set up in Chapter 2, “3D Models”, where you rendered train.usd.
Starter project

MathLibrary.swift — located in the Utility group — contains methods that are extensions on float4x4 for creating the translation, scale and rotation matrices. This file also contains typealiases for float2/3/4, so you don't have to type simd_float2/3/4.

Model.swift contains the model initialization and loading code. This file also contains render(encoder:) — called from Renderer's draw(in:) — which renders the model.

VertexDescriptor.swift creates a default MDLVertexDescriptor. The default MTKVertexDescriptor is derived from this descriptor. When using Model I/O to load models with vertex descriptors, the code can get a bit lengthy. Rather than creating a MetalKit MTKVertexDescriptor, it's easier to create a Model I/O MDLVertexDescriptor and then convert to the MTKVertexDescriptor that the pipeline state object needs using MTKMetalVertexDescriptorFromModelIO(_:). If you examine the vertex descriptor code from the previous chapter, the same process is used for both vertex descriptors. You describe attributes and layouts.

At the moment, your train:

• Takes up the entire width of the screen.
• Has no depth perspective.
• Stretches to fit the size of the application window.
You can decouple the train’s vertex positions from the window size by taking the train into other coordinate spaces. The vertex function is responsible for converting the model vertices through these various coordinate spaces, and that’s where you’ll perform the matrix multiplications that do the conversions between different spaces.
Uniforms

Constant values that are the same across all vertices or fragments are generally referred to as uniforms. The first step is to create a uniform structure to hold the conversion matrices. After that, you'll apply the uniforms to every vertex. Both the shaders and the code on the Swift side will access these uniform values. If you were to create a struct in Renderer and a matching struct in Shaders.metal, there's a good chance you'd forget to keep them synchronized. Therefore, the best approach is to create a bridging header that both C++ and Swift can access. You'll do that now:

➤ Using the macOS Header File template, create a new file in the Shared group and name it Common.h.

➤ In the Project navigator, click the main Spaces project folder.

➤ Select the project Spaces, and then select Build Settings along the top. Make sure All and Combined are highlighted.

➤ In the search bar, type bridg to filter the settings. Double-click the Objective-C Bridging Header value and enter Shared/Common.h.
Setting up the bridging header

This configuration tells Xcode to use this file for both the C++-derived Metal Shading Language and Swift.
➤ In Common.h before the final #endif, add the following code:

#import <simd/simd.h>

This code imports the simd framework, which provides types and functions for working with vectors and matrices.

➤ Next, add the uniforms structure:

typedef struct {
  matrix_float4x4 modelMatrix;
  matrix_float4x4 viewMatrix;
  matrix_float4x4 projectionMatrix;
} Uniforms;
These three matrices — each with four rows and four columns — will hold the necessary conversion between the spaces.
The Model Matrix

Your train vertices are currently in object space. To convert these vertices to world space, you'll use modelMatrix. By changing modelMatrix, you'll be able to translate, scale and rotate your train.

➤ In Renderer.swift, add the new structure to Renderer:

var uniforms = Uniforms()

You defined Uniforms in Common.h (the bridging header file), so Swift is able to recognize the Uniforms type.

➤ At the bottom of init(metalView:), add:

let translation = float4x4(translation: [0.5, -0.4, 0])
let rotation = float4x4(rotation: [0, 0, Float(45).degreesToRadians])
uniforms.modelMatrix = translation * rotation

Here, you set modelMatrix to have a translation of 0.5 units to the right, 0.4 units down and a counterclockwise rotation of 45 degrees.

➤ In draw(in:) before model.render(encoder: renderEncoder), add this:

renderEncoder.setVertexBytes(
  &uniforms,
  length: MemoryLayout<Uniforms>.stride,
  index: 11)
This code sets up the uniform matrix values on the Swift side.

➤ Open Shaders.metal, and import the bridging header file after setting the namespace:

#import "Common.h"

➤ Change the vertex function to:

vertex VertexOut vertex_main(
  VertexIn in [[stage_in]],
  constant Uniforms &uniforms [[buffer(11)]])
{
  float4 position = uniforms.modelMatrix * in.position;
  VertexOut out {
    .position = position
  };
  return out;
}
Here, you receive the Uniforms structure as a parameter, and then you multiply all of the vertices by the model matrix. ➤ Build and preview the app.
Train in world space

In the vertex function, you multiply the vertex position by the model matrix. All of the vertices are rotated, then translated. The train vertex positions still relate to the width of the screen, so the train looks stretched. You'll fix that momentarily.
View Matrix

To convert between world space and camera space, you set a view matrix. Depending on how you want to move the camera in your world, you can construct the view matrix appropriately. The view matrix you'll create here is a simple one, best for FPS (First Person Shooter) style games.

➤ In Renderer.swift at the end of init(metalView:), add this code:

uniforms.viewMatrix = float4x4(translation: [0.8, 0, 0]).inverse

Remember that all of the objects in the scene should move in the opposite direction to the camera. inverse does an opposite transformation. So, as the camera moves to the right, everything in the world appears to move 0.8 units to the left. With this code, you set the camera in world space, and then you add .inverse so that the objects will react in inverse relation to the camera.

➤ In Shaders.metal, change:

float4 position = uniforms.modelMatrix * in.position;

➤ To:

float4 position = uniforms.viewMatrix * uniforms.modelMatrix * in.position;
➤ Build and preview the app.
Train in camera space
The train moves 0.8 units to the left. Later, you'll be able to navigate through a scene using the keyboard, and just changing the view matrix will update all of the objects in the scene around the camera.

The last matrix you'll set will prepare the vertices to move from camera space to clip space. This matrix will also allow you to use unit values instead of the -1 to 1 NDC (Normalized Device Coordinates) that you've been using. To demonstrate why this is necessary, you'll add some animation to the train and rotate it on the y-axis.

➤ Open Renderer.swift, and in draw(in:), just above the following code:

renderEncoder.setVertexBytes(
  &uniforms,
  length: MemoryLayout<Uniforms>.stride,
  index: 11)

➤ Add this code:

timer += 0.005
uniforms.viewMatrix = float4x4.identity
let translationMatrix = float4x4(translation: [0, -0.6, 0])
let rotationMatrix = float4x4(rotationY: sin(timer))
uniforms.modelMatrix = translationMatrix * rotationMatrix
Here, you reset the camera view matrix and replace the model matrix with a rotation around the y-axis. ➤ Build and preview the app.
A clipped train
You can see that when the train rotates, any vertices greater than 1.0 on the z-axis are clipped. Any vertex outside Metal’s NDC will be clipped.
NDC clipping
Projection

It's time to apply some perspective to your render to give your scene some depth. The following diagram shows a 3D scene. At the bottom-right, you can see how the rendered scene will appear.
Projection of a scene
When you render a scene, you need to consider:

• How much of that scene will fit on the screen. Your eyes have a field of view of about 200º, and within that field of view, your computer screen takes up about 70º.
• How far you can see by having a far plane. Computers can't see to infinity.
• How close you can see by having a near plane.
• The aspect ratio of the screen. Currently, your train changes size when the screen size changes. When you take into account the width and height ratio, this won't happen.

The image above shows all these things. The shape created from the near to the far plane is a cut-off pyramid called a frustum. Anything in your scene that's located outside the frustum will not render. Compare the rendered image again to the scene setup. The rat in the scene won't render because he's in front of the near plane.

MathLibrary.swift provides a projection method that returns the matrix to project objects within this frustum into clip space, ready for conversion to NDC coordinates.
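To give you a rough idea of what such a method does, here's a sketch of a left-handed perspective projection with a 0-to-1 depth range. It's illustrative only, not necessarily identical to the implementation in MathLibrary.swift:

import simd
import Foundation

func perspectiveProjection(
  fov: Float, near: Float, far: Float, aspect: Float
) -> float4x4 {
  let yScale = 1 / tan(fov * 0.5)  // scale y by the field of view
  let xScale = yScale / aspect     // scale x by the aspect ratio
  let zScale = far / (far - near)  // map depth into 0...1
  return float4x4(
    [xScale, 0, 0, 0],
    [0, yScale, 0, 0],
    [0, 0, zScale, 1],             // w receives the original z
    [0, 0, -near * zScale, 0])
}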
Projection Matrix

➤ Open Renderer.swift, and add this code to mtkView(_:drawableSizeWillChange:):

let aspect = Float(view.bounds.width) / Float(view.bounds.height)
let projectionMatrix = float4x4(
  projectionFov: Float(45).degreesToRadians,
  near: 0.1,
  far: 100,
  aspect: aspect)
uniforms.projectionMatrix = projectionMatrix

This delegate method gets called whenever the view size changes. Because the aspect ratio will change, you must reset the projection matrix. You're using a field of view of 45º, a near plane of 0.1 and a far plane of 100 units.
➤ At the end of init(metalView:), add this:

mtkView(
  metalView,
  drawableSizeWillChange: metalView.bounds.size)

This code ensures that you set up the projection matrix at the start of the app.

➤ In the vertex function of Shaders.metal, change the position matrix calculation to:

float4 position = uniforms.projectionMatrix *
  uniforms.viewMatrix * uniforms.modelMatrix * in.position;
➤ Build and preview the app.
Zoomed in

Because of the projection matrix, the z-coordinates measure differently now, so you're zoomed in on the train.

➤ In Renderer.swift in draw(in:), replace:

uniforms.viewMatrix = float4x4.identity

➤ With:

uniforms.viewMatrix = float4x4(translation: [0, 0, -3]).inverse
This moves the camera back into the scene by three units. Build and preview:
Camera moved back

➤ In mtkView(_:drawableSizeWillChange:), change the projection matrix's projectionFov parameter to 70º, then build and preview the app.

A greater field of view

The train appears smaller because the field of view is wider, and more objects can fit horizontally into the rendered scene.

Note: Experiment with the projection values and the model transformation. In draw(in:), set translationMatrix's z translation value to a distance of 97, and the front of the train is just visible. At z = 98, the train is no longer visible. The projection far value is 100 units, and the camera is back 3 units. If you change the projection's far parameter to 1000, the train is visible again.
➤ To render a solid train, in draw(in:), remove:

renderEncoder.setTriangleFillMode(.lines)
The train positioned in a scene
Perspective Divide

Now that you've converted your vertices from object space through world space, camera space and clip space, the GPU takes over to convert to NDC coordinates (that's -1 to 1 in the x and y directions and 0 to 1 in the z direction).

The ultimate aim is to scale all the vertices from clip space into NDC space, and by using the fourth w component, that task gets a lot easier. To scale a point, such as (1, 2, 3), you can have a fourth component: (1, 2, 3, 3). Divide by that last w component to get (1/3, 2/3, 3/3, 1). The xyz values are now scaled down. These coordinates are known as homogeneous, which means of the same kind.

The projection matrix projected the vertices from a frustum to a cube in the range -w to w. After the vertex leaves the vertex function along the pipeline, the GPU performs a perspective divide and divides the x, y and z values by their w value. The higher the w value, the further back the coordinate is. The result of this calculation is that all visible vertices will now be within NDC.

Note: To avoid a divide by zero, the projection near plane should always be a value slightly more than zero.
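To make the (1, 2, 3, 3) example concrete, here's the same divide expressed with simd types (illustrative values only):

import simd

let clipPosition = simd_float4(1, 2, 3, 3)       // x, y, z, w after projection
let ndcPosition = clipPosition / clipPosition.w  // (0.333, 0.667, 1.0, 1.0)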
The w value is the main difference between a float4 vector direction and a float4 position. Because of the perspective divide, a position must have a value — generally 1 — in w, whereas a vector should have 0 in w because it doesn't go through the perspective divide.

In the following picture, the dog and cat are the same height — perhaps a y value of 2, for example. With projection, since the dog is farther back, it should appear smaller in the final render.
The dog should appear smaller. After projection, the cat might have a w value of ~1, and the dog might have a w value of ~8. Dividing by w would give the cat a height of 2 and the dog a height of 1/4, which will make the dog appear smaller.
NDC to Screen

Finally, the GPU converts from normalized coordinates to whatever the device screen size is. You may already have done something like this at some time in your career when converting between normalized coordinates and screen coordinates. To convert Metal NDC (Normalized Device Coordinates), which are between -1 and 1, to a device, you can use something like this:

converted.x = point.x * screenWidth/2 + screenWidth/2
converted.y = point.y * screenHeight/2 + screenHeight/2
However, you can also do this with a matrix by scaling half the screen size and translating by half the screen size. The clear advantage of this method is that you can set up a transformation matrix once and multiply any normalized point by the matrix to convert it into the correct screen space using code like this: converted = matrix * point
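Here's a sketch of setting up that conversion as a matrix with simd, assuming a hypothetical 800x800-pixel drawable. It's only an illustration of the idea, not code from the project:

import simd

let screenWidth: Float = 800
let screenHeight: Float = 800

var ndcToScreen = matrix_identity_float4x4
ndcToScreen.columns.0.x = screenWidth / 2    // scale x by half the width
ndcToScreen.columns.1.y = screenHeight / 2   // scale y by half the height
ndcToScreen.columns.3.x = screenWidth / 2    // translate x by half the width
ndcToScreen.columns.3.y = screenHeight / 2   // translate y by half the height

let point = simd_float4(0.5, -0.5, 0, 1)     // an NDC point
let converted = ndcToScreen * point          // (600, 200, 0, 1)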
The rasterizer on the GPU takes care of the matrix calculation for you.
Refactoring the Model Matrix

Currently, you set all the matrices in Renderer. Later, you'll create a Camera structure to calculate the view and projection matrices. For the model matrix, rather than updating it directly, any object that you can move — such as a model or a camera — can hold a position, rotation and scale. From this information, you can construct the model matrix.

➤ Create a new Swift file named Transform.swift.

➤ Add the new structure:

struct Transform {
  var position: float3 = [0, 0, 0]
  var rotation: float3 = [0, 0, 0]
  var scale: Float = 1
}

This struct will hold the transformation information for any object that you can move.

➤ Add an extension with a computed property:

extension Transform {
  var modelMatrix: matrix_float4x4 {
    let translation = float4x4(translation: position)
    let rotation = float4x4(rotation: rotation)
    let scale = float4x4(scaling: scale)
    let modelMatrix = translation * rotation * scale
    return modelMatrix
  }
}

This code automatically creates a model matrix from any transformable object.

➤ Add a new protocol so that you can mark objects as transformable:

protocol Transformable {
  var transform: Transform { get set }
}
➤ Because it's a bit long-winded to type model.transform.position, add a new extension to Transformable:

extension Transformable {
  var position: float3 {
    get { transform.position }
    set { transform.position = newValue }
  }
  var rotation: float3 {
    get { transform.rotation }
    set { transform.rotation = newValue }
  }
  var scale: Float {
    get { transform.scale }
    set { transform.scale = newValue }
  }
}

This code provides computed properties to allow you to use model.position directly, and the model's transform will update from this value.

➤ Open Model.swift, and mark Model as Transformable:

class Model: Transformable {
➤ Add the new transform property to Model:

var transform = Transform()

➤ Open Renderer.swift, and from init(metalView:), remove:

let translation = float4x4(translation: [0.5, -0.4, 0])
let rotation = float4x4(rotation: [0, 0, Float(45).degreesToRadians])
uniforms.modelMatrix = translation * rotation
➤ In draw(in:), replace:

let translationMatrix = float4x4(translation: [0, -0.6, 0])
let rotationMatrix = float4x4(rotationY: sin(timer))
uniforms.modelMatrix = translationMatrix * rotationMatrix

➤ With:

model.position.y = -0.6
model.rotation.y = sin(timer)
uniforms.modelMatrix = model.transform.modelMatrix
➤ Build and preview the app.
Using a transform in Model

The result is exactly the same, but the code is much easier to read — and changing a model's position, rotation and scale is more accessible. Later, you'll extract this code into a GameScene so that Renderer is left only to render models rather than manipulate them.
Key Points

• Coordinate spaces map different coordinate systems. To convert from one space to another, you can use matrix multiplication.
• Model vertices start off in object space. These are generally held in the file that comes from your 3D app, such as Blender, but you can procedurally generate them too.
• The model matrix converts object space vertices to world space. These are the positions that the vertices hold in the scene's world. The origin at [0, 0, 0] is the center of the scene.
• The view matrix moves vertices into camera space. Generally, your matrix will be the inverse of the position of the camera in world space.
• The projection matrix applies three-dimensional perspective to your vertices.

Where to Go From Here?

You've covered a lot of mathematical concepts in this chapter without diving too far into the underlying mathematical principles. To get started in computer graphics, you can fill your transform matrices and continue multiplying them at the usual times, but to be sufficiently creative, you'll need to understand some linear algebra. A great place to start is Grant Sanderson's Essence of Linear Algebra at https://bit.ly/3iYnkN1. This video series treats vectors and matrices visually. You'll also find some additional references in references.markdown in the resources folder for this chapter.
Chapter 7: The Fragment Function
Knowing how to render triangles, lines and points by sending vertex data to the vertex function is a pretty neat skill to have — especially since you’re able to create shapes using only simple, one-line fragment functions. However, fragment shaders are capable of doing a lot more. ➤ Open the website https://shadertoy.com, where you’ll find a dizzying number of brilliant community-created shaders.
shadertoy.com examples

These examples may look like renderings of complex 3D models, but looks are deceiving! Every “model” you see here is entirely generated using mathematics, written in a GLSL fragment shader. GLSL is the OpenGL Shading Language — and in this chapter, you'll begin to understand the principles that all shading masters use.
Note: Every graphics API uses its own shader language. The principles are the same, so if you find a GLSL shader you like, you can recreate it in Metal’s MSL.
The Starter Project

The starter project shows an example of using multiple pipeline states with different vertex functions, depending on whether you render the rotating train or the full-screen quad.

➤ Open the starter project for this chapter.

➤ Build and run the project. (You can choose to render the train or the quad. You'll start with the quad first.)

The starter project

Let's have a closer look at the code.

➤ Open Vertex.metal, and you'll see two vertex functions:

• vertex_main: This function renders the train, just as it did in the previous chapter.
• vertex_quad: This function renders the full-screen quad using an array defined in the shader.
Both functions output a VertexOut, containing only the vertex’s position. ➤ Open Renderer.swift. In init(metalView:options:), you’ll see two pipeline state objects — one for each of the two vertex functions. Depending on the value of options.renderChoice, draw(in:) renders either the train model or the quad. SwiftUI views handle updating Options. If you prefer previews to building and running the project each time, change the initial rendering value in Options.swift. ➤ Ensure you understand how this project works before you continue.
Screen Space

One of the many things a fragment function can do is create complex patterns that fill the screen on a rendered quad. At the moment, the fragment function has only the interpolated position output from the vertex function available to it. So first, you'll learn what you can do with this position and what its limitations are.

➤ Open Fragment.metal, and change the fragment function contents to:

float color;
in.position.x < 200 ? color = 0 : color = 1;
return float4(color, color, color, 1);

Here, the rasterizer interpolates in.position and converts it into screen space. You defined the width of the Metal view in ContentView.swift as 400 points. With the newly added code, you say that if the x position is less than 200, make the color black. Otherwise, make the color white.

Note: Although you can use an if statement, the compiler optimizes the ternary statement better, so it makes more sense to use that instead.
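For comparison, the if version the note refers to would look something like this; it's functionally identical, just more verbose:

if (in.position.x < 200) {
  color = 0;
} else {
  color = 1;
}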
➤ Build and preview the change.
MacBook Pro vs iPhone 12 Pro Max

Did you expect half the screen to be black? The view is 400 points wide, so it would make sense. But there's something you might not have considered: Apple Retina displays have varying pixel resolutions or pixel densities. For example, a MacBook Pro has a 2x Retina display, whereas the iPhone 12 Pro Max has a 3x Retina display. These varying displays mean that the 400-point Metal view on a MacBook Pro creates an 800x800 pixel drawable texture and the iPhone view creates a 1200x1200 pixel drawable texture.

Your quad fills up the screen, and you're writing to the view's drawable render target texture (the size of which matches the device's display), but there's no easy way to find out in the fragment function what size the current render target texture is.

➤ Open Common.h, and add a new structure:

typedef struct {
  uint width;
  uint height;
} Params;

This code holds parameters that you can send to the fragment function. You can add parameters to this structure as you need them.

➤ Open Renderer.swift, and add a new property to Renderer:

var params = Params()
You'll store the current render target size in the new property.

➤ Add the following code to the end of mtkView(_:drawableSizeWillChange:):

params.width = UInt32(size.width)
params.height = UInt32(size.height)

size contains the drawable texture size of the view. In other words, the view's bounds scaled by the device's scale factor.

➤ In draw(in:), before calling the methods to render the model or quad, send the parameters to the fragment function:

renderEncoder.setFragmentBytes(
  &params,
  length: MemoryLayout<Params>.stride,
  index: 12)

Notice that you're using setFragmentBytes(_:length:index:) to send data to the fragment function the same way you previously used setVertexBytes(_:length:index:).

➤ Open Fragment.metal, and change the signature of fragment_main to:

fragment float4 fragment_main(
  constant Params &params [[buffer(12)]],
  VertexOut in [[stage_in]])

Params, with the target drawing texture size, is now available to the fragment function.

➤ Change the code that sets the value of color — based on the value of in.position.x — to:

in.position.x < params.width * 0.5 ? color = 0 : color = 1;
Here, you’re using the target render size for the calculation. ➤ Preview the app in both macOS and the Retina x3 device, iPhone 12 Pro Max.
Fantastic, the render now looks the same for both devices.
Corrected for retina devices
Metal Standard Library Functions

In addition to standard mathematical functions such as sin, abs and length, there are a few other useful functions. Let's have a look.

step

step(edge, x) returns 0 if x is less than edge. Otherwise, it returns 1. This evaluation is exactly what you're doing with your current fragment function.

➤ Replace the contents of the fragment function with:

float color = step(params.width * 0.5, in.position.x);
return float4(color, color, color, 1);

This code produces the same result as before but with slightly less code.

➤ Build and run.
step
The result is that you get black on the left where the result of step is 0, and white on the right where the result of step is 1. Let's take this further with a checkerboard pattern.

➤ Replace the contents of the fragment function with:

uint checks = 8;
// 1
float2 uv = in.position.xy / params.width;
// 2
uv = fract(uv * checks * 0.5) - 0.5;
// 3
float3 color = step(uv.x * uv.y, 0.0);
return float4(color, 1.0);

Here's what's happening:

1. UV coordinates form a grid with values between 0 and 1. The center of the grid is at [0.5, 0.5]. UV coordinates are most often associated with mapping vertices to textures, as you'll see in Chapter 8, “Textures”.
2. fract(x) returns the fractional part of x. You take the fractional value of the UVs multiplied by half the number of checks, which gives you a value between 0 and 1. To center the UVs, you subtract 0.5.
3. If the result of the xy multiplication is less than zero, then the result is 0 or black. Otherwise, it's 1 or white.

➤ Build and run the app.
Checker board
length

Creating squares is a lot of fun, but let's create some circles using a length function.

➤ Replace the fragment function with:

float center = 0.5;
float radius = 0.2;
float2 uv = in.position.xy / params.width - center;
float3 color = step(length(uv), radius);
return float4(color, 1.0);
➤ Build and run the app.
Circle

To resize and move the shape around the screen, you change the circle's center and radius.

smoothstep

smoothstep(edge0, edge1, x) returns a smooth Hermite interpolation between 0 and 1.
Note: edge1 must be greater than edge0, and x should be a value between edge0 and edge1.
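As an illustration of how you might use it, here's a sketch that softens the edge of the circle example above. It assumes the same center, radius and uv setup, and it isn't a listing from the book:

float distanceFromCenter = length(uv);
// 1 inside the radius, fading to 0 across a 0.05-wide soft edge.
float3 color = 1.0 - smoothstep(radius, radius + 0.05, distanceFromCenter);
return float4(color, 1.0);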
Chapter 8: Textures

    func property(with semantic: MDLMaterialSemantic) -> MTLTexture? {
      guard let property = material?.property(with: semantic),
        property.type == .string,
        let filename = property.stringValue,
        let texture = TextureController.texture(filename: filename)
      else { return nil }
      return texture
    }
    baseColor = property(with: MDLMaterialSemantic.baseColor)
  }
}
property(with:) looks up the provided property in the submesh's material, finds the filename string value of the property and returns a texture if there is one.

Remember, there was another material property in the file marked Kd. That was the base color using floats. Material properties can also be float values where there is no texture available for the submesh. This loads the base color texture with the submesh's material. Here, Base color means the same as diffuse. Later, you'll load other textures for the submesh in the same way.

➤ At the bottom of init(mdlSubmesh:mtkSubmesh), add:

textures = Textures(material: mdlSubmesh.material)
This code completes the initialization and removes the compiler warning. ➤ Build and run your app to check that everything’s working. Your model will look the same as in the initial screenshot. However, you’ll get a message in the console:
The texture loader has successfully loaded lowpoly-house-color.png.
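As mentioned above, material properties can also be plain float values when a submesh has no texture. Here's a sketch of reading such a value with Model I/O; it's illustrative only, not code from the project:

import ModelIO
import simd

func color(with semantic: MDLMaterialSemantic, from material: MDLMaterial?) -> SIMD3<Float>? {
  guard let property = material?.property(with: semantic),
    property.type == .float3
  else { return nil }
  return property.float3Value
}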
2. Passing the Loaded Texture to the Fragment Function

In a later chapter, you'll learn about several other texture types and how to send them to the fragment function using different indices.

➤ Open Common.h, and add a new enumeration to keep track of these texture buffer index numbers:

typedef enum {
  BaseColor = 0
} TextureIndices;

➤ Open VertexDescriptor.swift, and add this code to the end of the file:

extension TextureIndices {
  var index: Int {
    return Int(self.rawValue)
  }
}
This code allows you to use BaseColor.index instead of Int(BaseColor.rawValue). A small touch, but it makes your code easier to read.

➤ Open Model.swift. In render(encoder:uniforms:params:) where you process the submeshes, add the following code below the comment // set the fragment texture here:

encoder.setFragmentTexture(
  submesh.textures.baseColor,
  index: BaseColor.index)
You're now passing the texture to the fragment function in texture buffer 0.

Note: Buffers, textures and sampler states are held in argument tables. As you've seen, you access these things by index numbers. On iOS, you can hold at least 31 buffers and textures, and 16 sampler states in the argument table; the number of textures on macOS increases to 128. You can find out feature availability for your device in Apple's Metal Feature Set Tables (https://apple.co/2UpCT8r).
3. Updating the Fragment Function

➤ Open Shaders.metal, and add the following new argument to fragment_main, immediately after VertexOut in [[stage_in]],:

texture2d<float> baseColorTexture [[texture(BaseColor)]]

You're now able to access the texture on the GPU.

➤ Replace all the code in fragment_main with:

constexpr sampler textureSampler;
When you read or sample the texture, you may not land precisely on a particular pixel. In texture space, the units that you sample are known as texels, and you can decide how each texel is processed using a sampler. You’ll learn more about samplers shortly.
➤ Next, add this:

float3 baseColor = baseColorTexture.sample(
  textureSampler, in.uv).rgb;
return float4(baseColor, 1);

Here, you sample the texture using the interpolated UV coordinates sent from the vertex function, and you retrieve the RGB values. In Metal Shading Language, you can use rgb to address the float elements as an equivalent of xyz. You then return the texture color from the fragment function.

➤ Build and run the app to see your textured house.
The textured house
sRGB Color Space

You'll notice that the rendered texture looks much darker than the original image. This change in color happens because lowpoly-house-color.png is an sRGB texture. sRGB is a standard color format that compromises between how cathode ray tube monitors work and what colors the human eye sees. As you can see in the following example of grayscale values from 0 to 1, sRGB colors are not linear. Humans are more able to discern between lighter values than darker ones.
Unfortunately, it's not easy to do the math on colors in a non-linear space. If you multiply a color by 0.5 to darken it, the difference in sRGB will vary along the scale.

You're currently loading the texture as sRGB pixel data and rendering it into a linear color space. So when you're sampling a value of, say 0.2, which in sRGB space is mid-gray, the linear space will read that as dark-gray. To approximately convert the color, you can use the inverse of gamma 2.2:

sRGBcolor = pow(linearColor, 1.0/2.2);
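Applied to this chapter's shader, that formula would be a one-line change at the end of fragment_main. It's shown here only as a sketch of the idea, since the better fix follows below:

float3 correctedColor = pow(baseColor, 1.0 / 2.2);
return float4(correctedColor, 1);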
If you use this formula on baseColor before returning from the fragment function, your house texture will look about the same as the original sRGB texture. However, a better way of dealing with this problem is not to load the texture as sRGB at all.

➤ Open TextureController.swift, and in loadTexture(filename:), locate:

let textureLoaderOptions: [MTKTextureLoader.Option: Any] =
  [.origin: MTKTextureLoader.Origin.bottomLeft]

➤ Change it to:

let textureLoaderOptions: [MTKTextureLoader.Option: Any] = [
  .origin: MTKTextureLoader.Origin.bottomLeft,
  .SRGB: false
]
➤ Build and run the app, and the texture now loads with the linear color pixel format bgra8Unorm.
Linear workflow
Note: An alternative to loading the textures with SRGB as false is to change the MTKView‘s colorPixelFormat to bgra8Unorm_srgb. This change will affect the view’s color space, and the clear color background will also change. You’ll find further reading on chromaticity and color in references.markdown in the resources folder for this chapter.
Capture GPU Workload

There's an easy way to find out what format your texture is in on the GPU, and also to look at all the other Metal buffers currently residing there: the Capture GPU workload tool (also called the GPU Debugger).

➤ Run your app, and at the bottom of the Xcode window (or above the debug console if you have it open), click the M Metal icon, change the number of frames to count to 1, and click Capture in the pop-up window:
This button captures the current GPU frame. On the left in the Debug navigator, you’ll see the GPU trace:
A GPU trace

Note: To open or close all items in a hierarchy, you can Option-click the arrow.
You can see all the commands that you’ve given to the render command encoder, such as setFragmentBytes and setRenderPipelineState. Later, when you have several command encoders, you’ll see each one of them listed, and you can select them to see what actions or textures they have produced from their encoding. When you select drawIndexedPrimitives, the Vertex and Fragment resources show.
Resources on the GPU

➤ Double-click each vertex resource to see what's in the buffer:

• MDL_OBJ-Indices: The vertex indices.
• Buffer 0: The vertex position and normal data, matching the attributes of your VertexIn struct and the vertex descriptor.
• Buffer 1: The UV texture coordinate data.
• Vertex Bytes: The uniform matrices.
• Vertex Attributes: The incoming data from VertexIn, and the VertexOut return data from the vertex function.
• vertex_main: The vertex function. When you have multiple vertex functions, this is very useful to make sure that you set the correct pipeline state.
Going through the fragment resources:

• lowpoly-house-color.png: The house texture in texture slot 0.
• Fragment Bytes: The width and height screen parameters in params.
• fragment_main: The fragment function.

The attachments:

• CAMetalLayer Drawable: The result of the encoding in color attachment 0. In this case, this is the view's current drawable. Later, you'll use multiple color attachments.
• MTKView Depth: The depth buffer. Black is closer. White is farther. The rasterizer uses the depth map.

You can see from this list that the GPU is holding the lowpoly-house-color.png texture as BGRA8Unorm. If you reverse the previous section's texture loading options and comment out .SRGB: false, you'll be able to see that the texture is now BGRA8Unorm_sRGB. (Make sure you restore the option .SRGB: false before continuing.)

If you're ever uncertain as to what is happening in your app, capturing the GPU frame might give you the heads-up because you can examine every render encoder command and every buffer. It's a good idea to use this strategy throughout this book to examine what's happening on the GPU.
Samplers

When sampling your texture in the fragment function, you use a default sampler. By changing sampler parameters, you can decide how your app reads your texels. You'll now add a ground plane to your scene to see how you can control the appearance of the ground texture.

➤ Open Renderer.swift, and add a new property:

lazy var ground: Model = {
  Model(name: "plane.obj")
}()
In draw(in:), after rendering the house and before renderEncoder.endEncoding(), add:

ground.scale = 40
ground.rotation.y = sin(timer)
ground.render(
  encoder: renderEncoder,
  uniforms: uniforms,
  params: params)
This code adds a ground plane and scales it up. ➤ Build and run the app.
A stretched texture

The ground texture stretches to fit the ground plane, and each pixel in the texture may be used by several rendered fragments, giving it a pixellated look. By changing one of the sampler parameters, you can tell Metal how to process the texel where it's smaller than the assigned fragments.

➤ Open Shaders.metal. In fragment_main, change the textureSampler definition to:

constexpr sampler textureSampler(filter::linear);
raywenderlich.com
196
Metal by Tutorials
Chapter 8: Textures
➤ Build and run the app.
A smoothed texture The ground texture — although still stretched — is now smooth. There will be times, such as when you make a retro game of Frogger, that you’ll want to keep the pixelation. In that case, use nearest filtering.
Filtering In this particular case, however, you want to tile the texture. That’s easy with sampling. ➤ Change the sampler definition and the baseColor assignment to: constexpr sampler textureSampler( filter::linear, address::repeat); float3 baseColor = baseColorTexture.sample( textureSampler, in.uv * 16).rgb;
raywenderlich.com
197
Metal by Tutorials
Chapter 8: Textures
This code multiplies the UV coordinates by 16 and accesses the texture outside of the allowable limits of 0 to 1. address::repeat changes the sampler’s addressing mode, so it’ll repeat the texture 16 times across the plane. The following image illustrates the other address sampling options shown with a tiling value of 3. You can use s_address or t_address to change only the width or height coordinates, respectively.
The sampler address mode ➤ Build and run your app.
Texture tiling The ground looks great! The house… not so much. The shader has tiled the house texture as well. To overcome this problem, you’ll create a tiling property on the model and send it to the fragment function with params. ➤ In Common.h, add this to Params: uint tiling;
raywenderlich.com
198
Metal by Tutorials
Chapter 8: Textures
➤ In Model.swift, create a new property in Model: var tiling: UInt32 = 1
➤ In render(encoder:uniforms:params:), just after var params = fragment, add this: params.tiling = tiling
➤ In Renderer.swift, replace the declaration of ground with: lazy var ground: Model = { var ground = Model(name: "plane.obj") ground.tiling = 16 return ground }()
You’re now sending the model’s tiling factor to the fragment function. ➤ Open Shaders.metal. In fragment_main, replace the declaration of baseColor with: float3 baseColor = baseColorTexture.sample( textureSampler, in.uv * params.tiling).rgb;
➤ Build and run the app, and you’ll see that both the ground and house now tile correctly.
Corrected tiling
raywenderlich.com
199
Metal by Tutorials
Chapter 8: Textures
Note: Creating a sampler in the shader is not the only option. You can create an MTLSamplerState, hold it with the model and send the sampler state to the fragment function with the [[sampler(n)]] attribute. As the scene rotates, you’ll notice some distracting noise. You’ve seen what happens on the grass when you oversample a texture. But, when you undersample a texture, you can get a rendering artifact known as moiré, which is occurring on the roof of the house.
A moiré example In addition, the noise at the horizon almost looks as if the grass is sparkling. You can solve these artifact issues by sampling correctly using resized textures called mipmaps.
Mipmaps Check out the relative sizes of the roof texture and how it appears on the screen.
Size of texture compared to on-screen viewing
raywenderlich.com
200
Metal by Tutorials
Chapter 8: Textures
The pattern occurs because you’re sampling more texels than you have pixels. The ideal would be to have the same number of texels to pixels, meaning that you’d require smaller and smaller textures the further away an object is. The solution is to use mipmaps. Mipmaps let the GPU compare the fragment on its depth texture and sample the texture at a suitable size. MIP stands for multum in parvo — a Latin phrase meaning “much in small”. Mipmaps are texture maps resized down by a power of 2 for each level, all the way down to 1 pixel in size. If you have a texture of 64 pixels by 64 pixels, then a complete mipmap set would consist of: Level 0: 64 x 64, 1: 32 x 32, 2: 16 x 16, 3: 8 x 8, 4: 4 x 4, 5: 2 x 2, 6: 1 x 1.
Mipmaps In the following image, the top checkered texture has no mipmaps. But in the bottom image, every fragment is sampled from the appropriate MIP level. As the checkers recede, there’s much less noise, and the image is cleaner. At the horizon, you can see the solid color smaller gray mipmaps.
Mipmap comparison
raywenderlich.com
201
Metal by Tutorials
Chapter 8: Textures
You can easily and automatically generate these mipmaps when first loading the texture. ➤ Open TextureController.swift. In loadTexture(filename:), change the texture loading options to: let textureLoaderOptions: [MTKTextureLoader.Option: Any] = [ .origin: MTKTextureLoader.Origin.bottomLeft, .SRGB: false, .generateMipmaps: NSNumber(value: true) ]
This code will create mipmaps all the way down to the smallest pixel. There’s one more thing to change: the sampler. ➤ Open Shaders.metal, and add the following code to the construction of textureSampler: mip_filter::linear
The default for mip_filter is none. However, if you provide either .linear or .nearest, then the GPU will sample the correct mipmap. ➤ Build and run the app.
Mipmaps added The noise from both the building and the ground is gone.
raywenderlich.com
202
Metal by Tutorials
Chapter 8: Textures
Using the Capture GPU workload tool, you can inspect the mipmaps. Choose the draw call, and double-click a texture. At the bottom-left, you can choose the MIP level. This is MIP level 4 on the house texture:
Mipmap level 4 example
Anisotropy Your rendered ground is looking a bit muddy and blurred in the background. This is due to anisotropy. Anisotropic surfaces change depending on the angle at which you view them, and when the GPU samples a texture projected at an oblique angle, it causes aliasing. ➤ In Shaders.metal, add this to the construction of textureSampler: max_anisotropy(8)
Metal will now take eight samples from the texel to construct the fragment. You can specify up to 16 samples to improve quality. Use as few as you can to obtain the quality you need because the sampling can slow down rendering.
Note: As mentioned before, you can hold an MTLSamplerState on Model. If you increase anisotropy sampling, you may not want it on all models, and this might be a good reason for creating the sampler state outside the fragment shader. ➤ Build and run, and your render should be artifact-free.
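If you do take the sampler state route, here’s a minimal sketch of the CPU side. Treat the helper name, the fragment index and the idea of storing the result on Model as assumptions for illustration, not code from the book’s project:
import MetalKit

extension Model {
  // Hypothetical helper: build a per-model sampler state with anisotropy.
  static func buildSamplerState(device: MTLDevice) -> MTLSamplerState? {
    let descriptor = MTLSamplerDescriptor()
    descriptor.minFilter = .linear
    descriptor.magFilter = .linear
    descriptor.mipFilter = .linear
    descriptor.maxAnisotropy = 8
    return device.makeSamplerState(descriptor: descriptor)
  }
}

// Before drawing, bind it so it matches a [[sampler(0)]] parameter in the shader:
// encoder.setFragmentSamplerState(samplerState, index: 0)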
Anisotropy When you write your full game, you’re likely to have many textures for the different models. Some models are likely to have several textures. Organizing these textures and working out which ones need mipmaps can become labor-intensive. Plus, you’ll also want to compress images where you can and send textures of varying sizes and color gamuts to different devices. The asset catalog is where you’ll turn.
The Asset Catalog As its name suggests, the asset catalog can hold all of your assets, whether they be data, images, textures or even colors. You’ve probably used the catalog for app icons and images. Textures differ from images in that the GPU uses them, and thus they have different attributes in the catalog. To create textures, you add a new texture set to the asset catalog.
You’ll now replace the textures for the low poly house and ground and use textures from a catalog. ➤ Create a new file using the Asset Catalog template (found in the Resource section), and name it Textures. Remember to check both the iOS and macOS targets. ➤ With Textures.xcassets open, choose Editor ▸ Add New Asset ▸ AR and Textures ▸ Texture Set (or click the + at the bottom of the panel and choose AR and Textures ▸ Texture Set). ➤ Double-click the Texture name and rename it to grass. ➤ Open the Models ▸ Textures group and drag barn-ground.png to the Universal slot in your catalog. With the Attributes inspector open, click on the grass to see all of the texture options.
Texture options in the asset catalog Here, you can see that by default, all mipmaps are created automatically. If you change Mipmap Levels to Fixed, you can choose how many levels to make. If you don’t like the automatic mipmaps, you can replace them with your own custom mipmaps by dragging them to the correct slot.
Asset catalogs give you complete control of your textures without having to write cumbersome code, although you can still write the code using the MTLTextureDescriptor API if you want.
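For reference, the manual route looks roughly like this. It’s only a sketch, since the book’s project loads textures through MTKTextureLoader and the asset catalog rather than building them by hand:
import Metal

func makeEmptyTexture(
  device: MTLDevice,
  width: Int,
  height: Int
) -> MTLTexture? {
  let descriptor = MTLTextureDescriptor.texture2DDescriptor(
    pixelFormat: .rgba8Unorm,
    width: width,
    height: height,
    mipmapped: true)  // reserves space for the full MIP chain
  descriptor.usage = .shaderRead
  return device.makeTexture(descriptor: descriptor)
}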
Mipmap slots Now that you’re using named textures from the asset catalog instead of .png files, you’ll need to change your texture loader. ➤ Open TextureController.swift, and at the top of loadTexture(filename:), after defining textureLoader, add this: if let texture = try? textureLoader.newTexture( name: filename, scaleFactor: 1.0, bundle: Bundle.main, options: nil) { print("loaded texture: \(filename)") return texture }
This now searches the bundle for the named texture and loads it if there is one. When loading from the asset catalog, the options that you set in the Attributes inspector take the place of most of the texture loading options, so these options are now nil. The last thing to do is to make sure the model points to the new texture. ➤ Open plane.mtl, located in Models ▸ Ground. If your file is not text-editable, you can right-click the file and choose Open in External Editor. ➤ Replace: map_Kd ground.png
➤ With: #map_Kd ground.png map_Kd grass
Here, you commented out the old texture and added the new one. The grass texture will now load from the asset catalog in place of the old one. ➤ Repeat this for the low poly house to change it into a barn: 1. Create a new texture set in the asset catalog and rename it barn. 2. Drag lowpoly-barn-color.png into the texture set from the Models ▸ Textures group. 3. Change the name of the diffuse texture in Models ▸ LowPolyHouse ▸ lowpolyhouse.mtl to barn. Note: Be careful to drop the images on the texture’s Universal slot. If you drag the images into the asset catalog, they are, by default, images and not textures. And you won’t be able to make mipmaps on images or change the pixel format. ➤ Build and run your app to see your new textures.
SRGB rendering You can see that the textures have reverted to the sRGB space because you’re now loading them in their original format. You can confirm this using the Capture GPU workload tool.
➤ Open Textures.xcassets, click on the barn texture, and in the Attributes inspector, change the Interpretation to Data:
Convert texture to data When your app loads the sRGB texture to a non-sRGB buffer, it automatically converts from sRGB space to linear space. (See Apple’s Metal Shading Language document for the conversion rule.) By accessing as data instead of colors, your shader can treat the color data as linear. You’ll also notice in the above image that the origin — unlike loading the .png texture manually — is Top Left. The asset catalog loads textures differently. ➤ Repeat for the grass texture. ➤ Build and run, and your colors should now be correct.
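For reference, the per-channel sRGB-to-linear rule is the standard sRGB transfer function. Here it is as a plain Swift sketch, not code you need to add to the project:
import Foundation

func srgbToLinear(_ value: Float) -> Float {
  // Small values use a linear segment; the rest use a 2.4 power curve.
  value <= 0.04045 ? value / 12.92 : powf((value + 0.055) / 1.055, 2.4)
}

print(srgbToLinear(0.5)) // ≈ 0.214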
Corrected color space
The Right Texture for the Right Job Using asset catalogs gives you complete control over how to deliver your textures. Currently, you only have two color textures. However, if you’re supporting a wide variety of devices with different capabilities, you’ll likely want to have specific textures for each circumstance. On devices with less RAM, you’d want smaller graphics.
For example, here is a list of individual textures you can assign by checking the different options in the Attributes inspector, for the Apple Watch, devices with 3GB and 4GB memory, and sRGB and P3 displays.
Custom textures in the asset catalog
Texture Compression In recent years, people have put much effort into compressing textures to save both CPU and GPU memory. There are various formats you can use, such as ETC and PVRTC. Apple has embraced ASTC as being the most high-quality compressed format. ASTC is available on the A8 chip and newer. Using texture sets within the asset catalog allows your app to determine for itself which is the best format to use. With your app running on macOS, take a look at how much memory it’s consuming.
➤ Click on the Debug navigator and select Memory.
App memory usage This is the usage after 45 seconds — your app’s memory consumption will increase for about five minutes and then stabilize. If you capture the frame with the Capture GPU Workload button, you’ll see that the texture format on the GPU is RGBA8Unorm. When you use asset catalogs, Apple will automatically determine the most appropriate format for your texture. ➤ In Textures.xcassets, select each of your textures, and in the Attributes inspector, change the Pixel Format from Automatic to ASTC 8×8 Compressed Red Green Blue Alpha. This is a highly compressed format. ➤ Build and run your app, and check the memory usage again. You’ll see that the memory footprint is slightly reduced. However, so is the quality of the render. For distant textures, this quality might be fine, but you have to balance memory usage with render quality.
Compressed texture comparison Note: You may have to test the app on an iOS device to see the change in texture format in the GPU Debugger. On iOS, the automatic format will be ASTC 4×4, which is indistinguishable from the png render.
Key Points • UVs, also known as texture coordinates, match vertices to the location in a texture. • During the modeling process, you flatten the model by marking seams. You can then paint on a texture that matches the flattened model map. • You can load textures using either the MTKTextureLoader or the asset catalog. • A model may be split into groups of vertices known as submeshes. Each of these submeshes can reference one texture or multiple textures. • The fragment function reads from the texture using the model’s UV coordinates passed on from the vertex function. • The sRGB color space is the default color gamut. Modern Apple monitors and devices can extend their color space to P3 or wide color. • Capture GPU workload is a useful debugging tool. Use it regularly to inspect what’s happening on the GPU. • Mipmaps are resized textures that match the fragment sampling. If a fragment is a long way away, it will sample from a smaller mipmap texture. • The asset catalog is a great place to store all of your textures. Later, you’ll have multiple textures per model, and it’s better to keep them all in one place. Customization for different devices is easy using the asset catalog. • Topics such as color and compression are huge. In the resources folder for this chapter, in references.markdown, you’ll find some recommended articles to read further.
Chapter 9: Navigating a 3D Scene
A scene can consist of one or more cameras, lights and models. Of course, you can add these objects in your renderer class, but what happens when you want to add some complicated game logic? Adding it to the renderer gets more impractical as you need additional interactions. Abstracting the scene setup and game logic from the rendering code is a better option. Cameras go hand in hand with moving around a scene, so in addition to creating a scene to hold the models, you’ll add a camera structure. Ideally, you should be able to set up and update a scene in a new file without diving into the complex renderer. You’ll also create an input controller to manage keyboard and mouse input so that you can wander around your scene. The game engines will include features such as input controllers, physics engines and sound. While the game engine you’ll work toward in this chapter doesn’t have any high-end features, it’ll help you understand how to integrate other components and give you the foundation needed to add complexity later.
The Starter Project The starter project for this chapter is the same as the final project for the previous chapter.
Scenes A scene holds models, cameras and lighting. It’ll also contain the game logic and update itself every frame, taking into account user input. ➤ Open the starter project. Create a new Swift file called GameScene.swift and replace the code with: import MetalKit struct GameScene { }
If you created a structure named Scene rather than GameScene, there would be a conflict with the SwiftUI Scene you use in NavigationApp.swift. If you really want to use Scene, you can add the explicit namespace to Scene in NavigationApp.swift using SwiftUI.Scene. But it’s best to remember that Scenes belong to SwiftUI. ➤ Add this code to GameScene: lazy var house: Model = { Model(name: "lowpoly-house.obj") }() lazy var ground: Model = { var ground = Model(name: "plane.obj") ground.tiling = 16 ground.scale = 40 return ground }() lazy var models: [Model] = [ground, house]
The scene holds all of the models you need to render. ➤ Open Renderer.swift, and remove the instantiation of house and ground. ➤ Add a new property: lazy var scene = GameScene()
You still have some compile errors in draw(in:) because you removed house and ground. You’ll fix those shortly. At the moment, you rotate the models just before drawing them, but it’s a good idea to separate update and render. GameScene will update the models, and Renderer will render them. ➤ Open GameScene.swift and add a new update method to GameScene: mutating func update(deltaTime: Float) { ground.scale = 40 ground.rotation.y = sin(deltaTime) house.rotation.y = sin(deltaTime) }
Here, you perform the rotation and scaling, which are currently in Renderer. You’ll calculate deltaTime, which is the amount of time that has passed since the previous frame soon. You’ll pass this from Renderer to GameScene. ➤ In Renderer.swift, in draw(in:), replace everything between // update and render to // end update and render with: scene.update(deltaTime: timer) for model in scene.models { model.render( encoder: renderEncoder, uniforms: uniforms, params: params) }
➤ Build and run the app.
The initial scene
Here, you reduce the complexity of draw(in:), separate the update from the render and set up the scene to handle its own updates. You can also more easily add and update models in GameScene.
Cameras Instead of creating view and projection matrices in the renderer, you can abstract the construction and calculation away rendering code to a Camera structure. Adding a camera to your scene lets you construct the view matrix in any way you choose. Currently, you rotate the scene by rotating both house and ground in the y axis. While it looks to the viewer as if a camera is rotating around the scene, in fact, the view matrix doesn’t change. Now you’ll explore how to move a camera around the scene with keyboard and mouse input. Setting up a camera is simply a way of calculating a view matrix. Miscalculating the view matrix is a frequent pain point where your carefully rendered objects result in a blank screen. So, it’s worth spending time working out common camera setups. ➤ Create a new Swift file named Camera.swift, and replace the existing code with: import CoreGraphics protocol Camera: Transformable { var projectionMatrix: float4x4 { get } var viewMatrix: float4x4 { get } mutating func update(size: CGSize) mutating func update(deltaTime: Float) }
Cameras have a position and rotation, so they should conform to Transformable. All cameras have a projection and view matrix as well as methods to perform when the window size changes and when each frame updates. ➤ Create a custom camera: struct FPCamera: Camera { var transform = Transform() }
You created a first-person camera. Eventually, this camera will move forward when you press the W key.
➤ Add this code to FPCamera: var aspect: Float = 1.0 var fov = Float(70).degreesToRadians var near: Float = 0.1 var far: Float = 100 var projectionMatrix: float4x4 { float4x4( projectionFov: fov, near: near, far: far, aspect: aspect) }
Currently, you set up the projection matrix in Renderer‘s mtkView(_:drawableSizeWillChange:). You’ll remove that code in Renderer shortly. ➤ Add a new method to update the camera’s aspect ratio: mutating func update(size: CGSize) { aspect = Float(size.width / size.height) }
➤ Add the camera’s view matrix calculation: var viewMatrix: float4x4 { (float4x4(rotation: rotation) * float4x4(translation: position)).inverse }
Each camera calculates its own projection and view matrix. You’ll change this view matrix in a moment. But first, you’ll run the app to see what this matrix does. ➤ Add the update method: mutating func update(deltaTime: Float) { }
This method repositions the camera every frame. You’ve now set up all the properties and methods required to conform to Camera. ➤ Open GameScene.swift, and add a new camera property: var camera = FPCamera()
➤ Then add an initializer to GameScene: init() { camera.position = [0, 1.5, -5] }
➤ Create a new method: mutating func update(size: CGSize) { camera.update(size: size) }
This code will update the camera when the screen size changes. ➤ Open Renderer.swift and replace the entire contents of mtkView(_:drawableSizeWillChange:) with: scene.update(size: size)
When the screen size changes, the scene update calls the camera update, which in turn updates the aspect ratio needed for the projection matrix. ➤ In draw(in:), after scene.update(deltaTime: timer), add: uniforms.viewMatrix = scene.camera.viewMatrix uniforms.projectionMatrix = scene.camera.projectionMatrix
Here, you set up the uniforms the vertex shader requires. ➤ Next, remove the previous viewMatrix assignment at the top of draw(in:): uniforms.viewMatrix = float4x4(translation: [0, 1.5, -5]).inverse
Now instead of rotating the ground and the house, you can easily rotate the camera. ➤ Open GameScene.swift. In update(deltaTime:), replace: ground.rotation.y = sin(deltaTime) house.rotation.y = sin(deltaTime)
With: camera.rotation.y = sin(deltaTime)
➤ Build and run the app.
The camera rotating Instead of rotating the house and ground, the camera now rotates around the world origin. If you update the view matrix, the vertex shader code updates the final transformation of all the models in the scene. However, you don’t want the camera to rotate around the world origin in a first-person camera: You want it to rotate around its own origin. ➤ Open Camera.swift, and change viewMatrix in FPCamera to: var viewMatrix: float4x4 { (float4x4(translation: position) * float4x4(rotation: rotation)).inverse }
Here, you reverse the order of matrix multiplication so that the camera rotates around its own origin. ➤ Build and run the app.
The camera rotating around its center The camera now rotates in place. Next, you’ll set up keys to move around the scene.
Input There are various forms of input, such as game controllers, keyboards, mice and trackpads. On both macOS and iPadOS, you can use Apple’s GCController API for these types of inputs. This API helps you set your code up for: • Events or Interrupts: Takes action when the user presses the key. You can set delegate methods or closures of code to run when an event occurs. • Polling: Processes all pressed keys on every frame. In this app, you’ll use polling to move your cameras and players. It’s a personal choice, and neither method is better than the other. The input code you’ll build works on macOS and iPadOS if you connect a keyboard and mouse to your iPad. If you want input on your iPhone or iPad without extra controllers, use GCVirtualController, which lets you configure on-screen controls that emulate a game controller. You can download Apple’s Supporting Game Controllers sample code (https://apple.co/3qfeL60) that demonstrates this. ➤ Create a new Swift file named InputController.swift, and replace the code with: import GameController class InputController { static let shared = InputController() }
This code creates a singleton input controller that you can access throughout your app. ➤ Add a new property to InputController: var keysPressed: Set<GCKeyCode> = []
In this set, InputController keeps track of all keys currently pressed. To track the keyboard, you need to set up an observer. ➤ Add this initializer:
private init() {
  let center = NotificationCenter.default
  center.addObserver(
    forName: .GCKeyboardDidConnect,
    object: nil,
    queue: nil) { notification in
      let keyboard = notification.object as? GCKeyboard
      keyboard?.keyboardInput?.keyChangedHandler = { _, _, keyCode, pressed in
        if pressed {
          self.keysPressed.insert(keyCode)
        } else {
          self.keysPressed.remove(keyCode)
        }
      }
    }
}
Here, you add an observer to set the keyChangedHandler when the keyboard first connects to the app. When the player presses or lifts a key, the keyChangedHandler code runs and either adds or removes the key from the set. Now test to see if it works. ➤ Open GameScene.swift, and add this to the end of update(deltaTime:): if InputController.shared.keysPressed.contains(.keyH) { print("H key pressed") }
➤ Build and run the app. Then press different keys. Press the H key in the console and you’ll see your print log.
Notice the annoying warning ‘beep’ from macOS when you press the keys. ➤ Open InputController.swift, and add this to the end of init(): #if os(macOS) NSEvent.addLocalMonitorForEvents( matching: [.keyUp, .keyDown]) { _ in nil } #endif
Here, on macOS only, you interrupt the view’s responder chain by handling any key presses and telling the system that it doesn’t need to take action when a key is pressed. You don’t need to do this for iPadOS, as the iPad doesn’t make the keyboard noise.
Note: You could add keys to keysPressed in this code instead of using the observer. However, that wouldn’t work on iPadOS, and GCKeyCode is easier to read than the raw key values that NSEvent gives you. ➤ Build and rerun the app. Test pressing keys. Now the noise is gone. If you have a Bluetooth keyboard and an iPad device, connect the keyboard to the iPad and test that it also works on iPadOS. ➤ Open GameScene.swift, and remove the key testing code. You can now capture any pressed key. Soon, you’ll set up standard movement keys to move the camera.
Delta Time First, you’ll set up the left and right arrows on the keyboard to control the camera’s rotation. When considering movement, think about how much time has passed since the last movement occurred. In an ideal world, at 60 frames per second, a frame should take about 16.7 milliseconds (0.0167 seconds) to execute. However, some displays can produce 120 frames per second or even a variable refresh rate. If you get a choppy frame rate, you can smooth out the movement by calculating delta time, which is the amount of time since the previous execution of the code. ➤ Open Renderer.swift, and replace var timer: Float = 0 with: var lastTime: Double = CFAbsoluteTimeGetCurrent() lastTime holds the time from the previous frame. You initialize it with the current
time. ➤ In draw(in:), remove: timer += 0.005
➤ Replace scene.update(deltaTime: timer) with: let currentTime = CFAbsoluteTimeGetCurrent() let deltaTime = Float(currentTime - lastTime) lastTime = currentTime scene.update(deltaTime: deltaTime)
Here, you get the current time and calculate the difference from the last time.
Camera Rotation ➤ Open GameScene.swift. In update(deltaTime:), replace: camera.rotation.y = sin(deltaTime)
With: camera.update(deltaTime: deltaTime)
➤ Create a new Swift file named Movement.swift, and add: enum Settings { static var rotationSpeed: Float { 2.0 } static var translationSpeed: Float { 3.0 } static var mouseScrollSensitivity: Float { 0.1 } static var mousePanSensitivity: Float { 0.008 } }
You can tweak these settings to make your camera and mouse movement smooth. Eventually, you might choose to replace Settings with a user interface that sets the values. • rotationSpeed: How many radians the camera should rotate in one second. You’ll calculate the amount the camera should rotate in delta time. • translationSpeed: The distance per second that your camera should travel. • mouseScrollSensitivity and mousePanSensitivity: Settings to adjust mouse tracking and scrolling.
➤ Add a new protocol: protocol Movement where Self: Transformable { }
Your game might move a player object instead of a camera, so make the movement code as flexible as possible. Now you can give any Transformable object Movement. ➤ Create an extension with a default method: extension Movement { func updateInput(deltaTime: Float) -> Transform { var transform = Transform() let rotationAmount = deltaTime * Settings.rotationSpeed let input = InputController.shared if input.keysPressed.contains(.leftArrow) { transform.rotation.y -= rotationAmount } if input.keysPressed.contains(.rightArrow) { transform.rotation.y += rotationAmount } return transform } }
You already told InputController to add and remove key presses to a Set called keysPressed. Here, you find out if keysPressed contains the arrow keys. If it does, you change the transform rotation value. ➤ Open Camera.swift and add the protocol conformance to FPCamera: extension FPCamera: Movement { }
➤ Still in Camera.swift, add this code to update(deltaTime:): let transform = updateInput(deltaTime: deltaTime) rotation += transform.rotation
You update the camera’s rotation with the transform calculated in Movement.
➤ Build and run the app. Now, use the arrow keys to rotate the camera.
Using arrow keys to rotate the camera
Camera Movement You can implement forward and backward movement the same way using standard WASD keys: • W: Move forward • A: Strafe left • S: Move backward • D: Strafe right Here’s what to expect when you move the camera through the scene: • You’ll move along the x and z axes. • Your camera will have a direction vector. When you press the W key, you’ll move along the z axis in a positive direction. • If you press the W and D keys simultaneously, you’ll move diagonally. • When you press the left and right arrow keys, you’ll rotate in that direction, which changes the camera’s forward direction vector.
➤ Open Movement.swift, and add a computed property to Movement’s extension: var forwardVector: float3 { normalize([sin(rotation.y), 0, cos(rotation.y)]) }
This is the forward vector based on the current rotation. The following image shows an example of forward vectors when rotation.y is 0º and 45º:
Forward vectors ➤ Add a computed property to handle strafing from side to side: var rightVector: float3 { [forwardVector.z, forwardVector.y, -forwardVector.x] }
This vector points 90º to the right of the forward vector. ➤ Still in Movement, at the end of updateInput(deltaTime:), before return, add: var direction: float3 = .zero if input.keysPressed.contains(.keyW) { direction.z += 1 } if input.keysPressed.contains(.keyS) { direction.z -= 1 } if input.keysPressed.contains(.keyA) { direction.x -= 1
} if input.keysPressed.contains(.keyD) { direction.x += 1 }
This code processes each depressed key and creates a final desired direction vector. For instance, if the game player presses W and A, she wants to go diagonally forward and left. The final direction vector is [-1, 0, 1]. ➤ After the previous code, add: let translationAmount = deltaTime * Settings.translationSpeed if direction != .zero { direction = normalize(direction) transform.position += (direction.z * forwardVector + direction.x * rightVector) * translationAmount }
Here, you calculate the transform’s position from its current forward and right vectors and the desired direction. ➤ Open Camera.swift, and add this code to the end of update(deltaTime:): position += transform.position
➤ Build and run the app. Now, you can move about your scene using the keyboard controls.
Moving around the scene using the keyboard
Mouse and Trackpad Input Players on macOS games generally use mouse or trackpad movement to look around the scene rather than arrow keys. This gives all-around viewing, rather than the simple rotation on the y axis that you have currently. ➤ Open InputController.swift, and add a new structure to InputController that you’ll use in place of CGPoint: struct Point { var x: Float var y: Float static let zero = Point(x: 0, y: 0) }
Make sure that Point goes inside InputController to avoid future name conflicts. Point is the same as CGPoint, except it contains Floats rather than CGFloats. ➤ Add these properties to InputController to record mouse movement: var leftMouseDown = false var mouseDelta = Point.zero var mouseScroll = Point.zero
• leftMouseDown: Tracks when the player does a left-click. • mouseDelta: The movement since the last tracked movement. • mouseScroll: Keeps track of how much the player scrolls with the mouse wheel. ➤ At the end of init(), add: center.addObserver( forName: .GCMouseDidConnect, object: nil, queue: nil) { notification in let mouse = notification.object as? GCMouse }
This code to connect the mouse is similar to how you handled the keyboard connection. ➤ Inside the closure, after setting mouse, add: // 1 mouse?.mouseInput?.leftButton.pressedChangedHandler = { _, _, pressed in
self.leftMouseDown = pressed } // 2 mouse?.mouseInput?.mouseMovedHandler = { _, deltaX, deltaY in self.mouseDelta = Point(x: deltaX, y: deltaY) } // 3 mouse?.mouseInput?.scroll.valueChangedHandler = { _, xValue, yValue in self.mouseScroll.x = xValue self.mouseScroll.y = yValue }
Here, you: 1. Record when the user holds down the left mouse button. 2. Track mouse movement. 3. Record scroll wheel movement. xValue and yValue are normalized values between -1 and 1. If you use a game controller instead of a mouse, the first parameter is dpad, which tells you which directional pad element changed. Now you’re ready to use these tracked mouse input values. Note: For iOS, you should use touch values rather than mouse tracking. The challenge project sample sets up MetalView with gestures that update InputController values.
Arcball Camera In many apps, the camera rotates about a particular point. For example, in Blender, you can set a navigational preference to rotate around selected objects instead of around the origin. ➤ Open Camera.swift. Copy and paste the FPCamera structure without the extension. ➤ Rename the copied structure: struct ArcballCamera: Camera {
➤ Since you’re not implementing Movement, remove all the code from update(deltaTime:). Your code will now compile. ➤ Open GameScene.swift and change the camera initialization to: var camera = ArcballCamera()
Now, you have a camera with no movement.
Orbiting a Point The camera needs a track to rotate about a point: • Target: The point the camera will orbit. • Distance: The distance between the camera and the target. The player controls this with the mouse wheel. • Rotation: The camera’s rotation about the point. The player controls this by leftclicking the mouse and dragging.
Orbiting a point
From these three properties, you can determine the camera’s world position and rotate it to always point at the target position. To do this: 1. First, rotate the distance vector around the y axis. 2. Then, rotate the distance vector around the x axis. 3. Add the target position to the rotated vector to get the new world position of the camera. 4. Rotate the camera to look at the target position. ➤ Open Camera.swift, and add the necessary properties to ArcballCamera to keep track of the camera’s orbit:
let minDistance: Float = 0.0
let maxDistance: Float = 20
var target: float3 = [0, 0, 0]
var distance: Float = 2.5
You’ll constrain distance with minDistance and maxDistance. ➤ In update(deltaTime:), add: let input = InputController.shared let scrollSensitivity = Settings.mouseScrollSensitivity distance -= (input.mouseScroll.x + input.mouseScroll.y) * scrollSensitivity distance = min(maxDistance, distance) distance = max(minDistance, distance) input.mouseScroll = .zero
Here, you change distance depending on the mouse scroll values. ➤ Continue with this code: if input.leftMouseDown { let sensitivity = Settings.mousePanSensitivity rotation.x += input.mouseDelta.y * sensitivity rotation.y += input.mouseDelta.x * sensitivity rotation.x = max(-.pi / 2, min(rotation.x, .pi / 2)) input.mouseDelta = .zero }
If the player drags with the left mouse button, update the camera’s rotation values. ➤ Add this after the previous code: let rotateMatrix = float4x4( rotationYXZ: [-rotation.x, rotation.y, 0]) let distanceVector = float4(0, 0, -distance, 0) let rotatedVector = rotateMatrix * distanceVector position = target + rotatedVector.xyz
Here, you complete the calculations to rotate the distance vector and add the target position to the new vector. In MathLibrary.swift, float4x4(rotationYXZ:) creates a matrix using rotations in Y / X / Z order.
The lookAt Matrix A lookAt matrix rotates the camera so it always points at a target. In MathLibrary.swift, you’ll find a float4x4 initialization init(eye:center:up:). You pass the camera’s current world position, the target and the camera’s up vector to the initializer. In this app, the camera’s up vector is always [0, 1, 0]. ➤ In ArcballCamera, replace viewMatrix with: var viewMatrix: float4x4 { let matrix: float4x4 if target == position { matrix = (float4x4(translation: target) * float4x4(rotationYXZ: rotation)).inverse } else { matrix = float4x4(eye: position, center: target, up: [0, 1, 0]) } return matrix }
If the position is the same as the target, you simply rotate the camera to look around the scene at the target position. Otherwise, you rotate the camera with the lookAt matrix. ➤ Open GameScene.swift, and add this to the end of init(): camera.distance = length(camera.position) camera.target = [0, 1.2, 0]
With an arcball camera, you set the target and distance as well as the position.
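If you’re curious what init(eye:center:up:) does, here’s one common way to build a left-handed look-at matrix. Treat it as a sketch: the version in the book’s MathLibrary.swift may differ in details such as handedness and axis conventions.
import simd

func lookAt(eye: SIMD3<Float>, center: SIMD3<Float>, up: SIMD3<Float>) -> float4x4 {
  let z = normalize(center - eye)   // forward
  let x = normalize(cross(up, z))   // right
  let y = cross(z, x)               // true up
  // Rotate world axes into camera axes, then translate the eye to the origin.
  return float4x4(
    SIMD4<Float>(x.x, y.x, z.x, 0),
    SIMD4<Float>(x.y, y.y, z.y, 0),
    SIMD4<Float>(x.z, y.z, z.z, 0),
    SIMD4<Float>(-dot(x, eye), -dot(y, eye), -dot(z, eye), 1))
}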
➤ Build and run the app. Zoom in to get a view of the inside of the barn.
Inside the barn Click and drag to orbit the barn. In Movement.swift, change Settings to suit your tracking preferences.
Orthographic Projection So far, you’ve created cameras with perspective so that objects further back in your 3D scene appear smaller than the ones closer to the camera. Orthographic projection flattens three dimensions to two dimensions without any perspective distortion.
Orthographic projection
Sometimes it’s a little difficult to see what’s happening in a large scene. To help with that, you’ll build a top-down camera that shows the whole scene without any perspective distortion. ➤ Open Camera.swift, and add a new camera:
struct OrthographicCamera: Camera, Movement {
  var transform = Transform()
  var aspect: CGFloat = 1
  var viewSize: CGFloat = 10
  var near: Float = 0.1
  var far: Float = 100

  var viewMatrix: float4x4 {
    (float4x4(translation: position) *
      float4x4(rotation: rotation)).inverse
  }
}
aspect is the ratio of the window’s width to height. viewSize is the unit size of the
scene. You’ll calculate the projection frustum in the shape of a box. ➤ Add the projection matrix code: var projectionMatrix: float4x4 { let rect = CGRect( x: -viewSize * aspect * 0.5, y: viewSize * 0.5, width: viewSize * aspect, height: viewSize) return float4x4(orthographic: rect, near: near, far: far) }
Here, you calculate a rectangle for the front of the frustum using the view size and aspect ratio. Then you call the orthographic initializer defined in MathLibrary.swift with the frustum’s near and far values.
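It can also help to see what an orthographic initializer typically computes. This sketch maps the box defined by the rectangle and the near/far values onto Metal’s clip space (x and y in -1...1, z in 0...1); the version in MathLibrary.swift may differ slightly:
import simd
import CoreGraphics

func orthographic(rect: CGRect, near: Float, far: Float) -> float4x4 {
  let left = Float(rect.origin.x)
  let right = Float(rect.origin.x + rect.width)
  let top = Float(rect.origin.y)
  let bottom = Float(rect.origin.y - rect.height)  // origin.y is the top edge here
  return float4x4(
    SIMD4<Float>(2 / (right - left), 0, 0, 0),
    SIMD4<Float>(0, 2 / (top - bottom), 0, 0),
    SIMD4<Float>(0, 0, 1 / (far - near), 0),
    SIMD4<Float>((left + right) / (left - right),
                 (top + bottom) / (bottom - top),
                 near / (near - far),
                 1))
}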
The orthographic projection frustum ➤ Add the method to update the camera when the view size changes: mutating func update(size: CGSize) { aspect = size.width / size.height }
This code sets the aspect ratio for the orthographic projection matrix.
➤ Add the frame update method: mutating func update(deltaTime: Float) { let transform = updateInput(deltaTime: deltaTime) position += transform.position let input = InputController.shared let zoom = input.mouseScroll.x + input.mouseScroll.y viewSize -= CGFloat(zoom) input.mouseScroll = .zero }
Here, you use the previous Movement code to move around the scene using the WASD keys. You don’t need rotation, as you’re going to position the camera to be top-down. You use the mouse scroll to change the view size, which allows you to zoom in and out of the scene. ➤ Open GameScene.swift and change camera to: var camera = OrthographicCamera()
➤ Replace the contents of init() with: camera.position = [3, 2, 0] camera.rotation.y = -.pi / 2
➤ Build and run the app.
Orthographic viewing from the front
You’ll see a scene with no perspective. You might be surprised by this result. You can’t see the ground because there’s no field of view, which means you effectively “see” the ground plane side-on. Replace the previous code in init() with: camera.position = [0, 2, 0] camera.rotation.x = .pi / 2
This code places the camera in a top-down rotation. ➤ Build and run the app. You can still use the WASD keys to move about the scene and the mouse scroll to zoom out.
Orthographic viewing from the top You’ll often use an orthographic camera when creating 2D games that look down on an entire board. Later, you’ll also use an orthographic camera when implementing shadows from directional lights.
Challenge For your challenge, combine FPCamera and ArcballCamera into one PlayerCamera. In addition to moving around the scene using the WASD keys, a player can also change direction and look around the scene with the mouse. To achieve this: • Copy FPCamera to PlayerCamera. This sets position and rotation. • Copy the left mouse down code from ArcballCamera’s update(deltaTime:) to the end of PlayerCamera‘s update(deltaTime:). This sets rotation when the mouse is used. PlayerCamera won’t use the scroll wheel for zooming. • The view matrix should use rotation without the z axis because you always travel on the xy plane. So, replace PlayerCamera’s viewMatrix with: var viewMatrix: float4x4 { let rotateMatrix = float4x4( rotationYXZ: [-rotation.x, rotation.y, 0]) return (float4x4(translation: position) * rotateMatrix).inverse }
• Change the camera in GameScene and set its initial position. When you finish, wander around your scene with your left hand on the keyboard to control forward motion and your right hand on the mouse to control direction. The left and right arrow keys will work for rotation, too.
Moving around the scene
Key Points • Scenes abstract game code and scene setup away from the rendering code. • Camera structures let you calculate the view and projection matrices separately from rendering the models. • On macOS and iPadOS, use Apple’s GCController API to process input from game controllers, keyboards and mice. • On iOS, GCVirtualController gives you onscreen D-pad controls. • For a first-person camera, calculate position and rotation from the player’s perspective. • An arcball camera orbits a target point. • An orthographic camera renders without perspective so that all vertices rendered to the 2D screen appear at the same distance from the camera.
Chapter 10: Lighting Fundamentals
Light and shade are important requirements for making your scenes pop. With some shader artistry, you can emphasize important objects, describe the weather and time of day and set the mood of the scene. Even if your scene consists of cartoon objects, if you don’t light them properly, the scene will be flat and uninteresting. One of the simplest methods of lighting is the Phong reflection model. It’s named after Bui Tong Phong who published a paper in 1975 extending older lighting models. The idea is not to attempt duplication of light and reflection physics but to generate pictures that look realistic. This model has been popular for over 40 years and is a great place to start learning how to fake lighting using a few lines of code. All computer images are fake, but there are more modern real-time rendering methods that model the physics of light. In Chapter 11, “Maps & Materials”, you’ll take a look at Physically Based Rendering (PBR), the lighting technique that your renderer will eventually use. PBR is a more realistic lighting model, but Phong is easy to understand and get started with.
The Starter Project ➤ Open the starter project for this chapter. The starter project’s files are now in sensible groups. In the Game group, the project contains a new game controller class which further separates scene updates and rendering. Renderer is now independent from GameScene. GameController initializes and owns both Renderer and GameScene. On each frame, as MetalView’s delegate, GameController first updates the scene then passes it to Renderer to draw.
Object ownership In GameScene.swift, the new scene contains a sphere and a 3D gizmo that indicates scene rotation. DebugLights.swift in the Utility group contains some code that you’ll use later for debugging where lights are located. Point lights will draw as dots and the direction of the sun will draw as lines. In the Geometry group, the default vertex descriptor in VertexDescriptor.swift now includes a color buffer. The sphere model has a texture to show colors, but the 3D gizmo uses vertex colors. In the Shaders group, the vertex shader forwards this color to the fragment shader, and the fragment shader uses the vertex color if there is no color texture.
➤ Familiarize yourself with the code and build and run the project.
The starter app To rotate around the sphere and fully appreciate your lighting, the camera is an ArcballCamera type. Press 1 to set the camera to a front view, and 2 to reset the camera to the default view. GameScene contains the key pressing code for this. You can see that the sphere colors are very flat. In this chapter, you’ll add shading and specular highlights.
Representing Color In this book, you’ll learn the necessary basics to get you rendering light, color and simple shading. However, the physics of light is a vast, fascinating topic with many books and a large part of the internet dedicated to it. You can find further reading in references.markdown in the resources directory for this chapter. In the real world, the reflection of different wavelengths of light is what gives an object its color. A surface that absorbs all light is black. Inside the computer world, pixels display color. The more pixels, the better the resolution and this makes the resulting image clearer. Each pixel is made up of subpixels. These are a predetermined single color, either red, green or blue. By turning on and off these subpixels, depending on the color depth, the screen can display most of the colors visible to the human eye.
In Swift, you can represent a color using the RGB values for that pixel. For example, float3(1, 0, 0) is a red pixel, float3(0, 0, 0) is black and float3(1, 1, 1) is white. From a shading point of view, you can combine a red surface with a gray light by multiplying the two values together: let result = float3(1.0, 0.0, 0.0) * float3(0.5, 0.5, 0.5)
The result is (0.5, 0, 0), which is a darker shade of red.
Color shading For simple Phong lighting, you can use the slope of the surface. The more the surface slopes away from a light source, the darker the surface becomes.
A 3D shaded sphere
Normals The slope of a surface can determine how much a surface reflects light. In the following diagram, point A is facing straight toward the sun and will receive the most amount of light; point B is facing slightly away but will still receive some light; point C is facing entirely away from the sun and shouldn’t receive any of the light.
Surface normals on a sphere Note: In the real world, light bounces from surface to surface; if there’s any light in the room, there will be some reflection from objects that gently lights the back surfaces of all the other objects. This is global illumination. The Phong lighting model lights each object individually and is called local illumination. The dotted lines in the diagram are tangent to the surface. A tangent line is a straight line that best describes the slope of the curve at a point. The lines coming out of the circle are at right angles to the tangent lines. These are called surface normals, and you first encountered these in Chapter 7, “The Fragment Function”.
Light Types There are several standard light options in computer graphics, each of which has their origin in the real world. • Directional Light: Sends light rays in a single direction. The sun is a directional light. • Point Light: Sends light rays in all directions like a light bulb. • Spotlight: Sends light rays in limited directions defined by a cone. A flashlight or a desk lamp would be a spotlight.
Directional Light A scene can have many lights. In fact, in studio photography, it would be highly unusual to have just a single light. By putting lights into a scene, you control where shadows fall and the level of darkness. You’ll add several lights to your scene through the chapter. The first light you’ll create is the sun. The sun is a point light that puts out light in all directions, but for computer modeling, you can consider it a directional light. It’s a powerful light source a long way away. By the time the light rays reach the earth, the rays appear to be parallel. Check this outside on a sunny day — everything you can see has its shadow going in the same direction.
The direction of sunlight
To define the light types, you’ll create a Light structure that both the GPU and the CPU can read, and a SceneLighting structure that will describe the lighting for GameScene. ➤ In the Shaders group, open Common.h, and before #endif, create an enumeration of the light types you’ll be using: typedef enum { unused = 0, Sun = 1, Spot = 2, Point = 3, Ambient = 4 } LightType;
➤ Under this, add the structure that defines a light: typedef struct { LightType type; vector_float3 position; vector_float3 color; vector_float3 specularColor; float radius; vector_float3 attenuation; float coneAngle; vector_float3 coneDirection; float coneAttenuation; } Light;
This structure holds the position and color of the light. You’ll learn about the other properties as you go through the chapter. ➤ Create a new Swift file in the Game group, and name it SceneLighting.swift. Then, add this: struct SceneLighting { static func buildDefaultLight() -> Light { var light = Light() light.position = [0, 0, 0] light.color = [1, 1, 1] light.specularColor = [0.6, 0.6, 0.6] light.attenuation = [1, 0, 0] light.type = Sun return light } }
This file will hold the lighting for GameScene. You’ll have several lights, and buildDefaultLight() will create a basic light.
➤ Create a property in SceneLighting for a sun directional light: let sunlight: Light = { var light = Self.buildDefaultLight() light.position = [1, 2, -2] return light }() position is in world space. This will place a light to the right of the scene, and
forward of the sphere. The sphere is placed at the world’s origin. ➤ Create an array to hold the various lights you’ll be creating shortly: var lights: [Light] = []
➤ Add the initializer: init() { lights.append(sunlight) }
You’ll add all your lights for the scene in the initializer. ➤ Open GameScene.swift, and add the lighting property to GameScene: let lighting = SceneLighting()
You’ll do all the light shading in the fragment function so you’ll need to pass the array of lights to that function. Metal Shading Language doesn’t have a dynamic array feature, and there is no way to find out the number of items in an array. You’ll pass this value to the fragment shader in Params. ➤ Open Common.h, and add these properties to Params: uint lightCount; vector_float3 cameraPosition;
You’ll need the camera position property later. While you’re in Common.h, add a new index to BufferIndices: LightBuffer = 13
You’ll use this to send lighting details to the fragment function. ➤ Open Renderer.swift, and add this to updateUniforms(scene:): params.lightCount = UInt32(scene.lighting.lights.count)
You’ll be able to access this value in the fragment shader function. ➤ In draw(scene:in:), just before for model in scene.models, add this: var lights = scene.lighting.lights renderEncoder.setFragmentBytes( &lights, length: MemoryLayout<Light>.stride * lights.count, index: LightBuffer.index)
Here, you send the array of lights to the fragment function in buffer index 13. You’ve now set up a sun light on the Swift side. You’ll do all the actual light calculations in the fragment function, and you’ll find out more about light properties.
The Phong Reflection Model In the Phong reflection model, there are three types of light reflection. You’ll calculate each of these, and then add them up to produce a final color.
Diffuse shading and micro-facets
• Diffuse: In theory, light coming at a surface bounces off at an angle reflected about the surface normal at that point. However, surfaces are microscopically rough, so light bounces off in all directions as the picture above indicates. This produces a diffuse color where the light intensity is proportional to the angle between the incoming light and the surface normal. In computer graphics, this model is called Lambertian reflectance named after Johann Heinrich Lambert who died in 1777. In the real world, this diffuse reflection is generally true of dull, rough surfaces, but the surface with the most Lambertian property is humanmade: Spectralon (https://en.wikipedia.org/wiki/Spectralon), which is used for optical components. • Specular: The smoother the surface, the shinier it is, and the light bounces off the surface in fewer directions. A mirror completely reflects off the surface normal without deflection. Shiny objects produce a visible specular highlight, and rendering specular lighting can give your viewers hints about what sort of surface an object is — whether a car is an old wreck or fresh off the sales lot. • Ambient: In the real-world, light bounces around all over the place, so a shadowed object is rarely entirely black. This is the ambient reflection. A surface color is made up of an emissive surface color plus contributions from ambient, diffuse and specular. For diffuse and specular, to find out how much light the surface should receive at a particular point, all you have to do is find out the angle between the incoming light direction and the surface normal.
The Dot Product Fortunately, there’s a straightforward mathematical operation to discover the angle between two vectors called the dot product:
A · B = AxBx + AyBy + AzBz
And:
cos(θ) = (A · B) / (||A|| × ||B||)
Where ||A|| means the length (or magnitude) of vector A. Even more fortunately, both simd and Metal Shading Language have a function dot() to get the dot product, so you don’t have to remember the formulas. As well as finding out the angle between two vectors, you can use the dot product for checking whether two vectors are pointing in the same direction.
Resize the two vectors into unit vectors — that’s vectors with a length of 1. You can do this using the normalize() function. If the unit vectors are parallel with the same direction, the dot product result will be 1. If they are parallel but opposite directions, the result will be -1. If they are at right angles (orthogonal), the result will be 0.
The dot product Looking at the previous diagram, if the yellow (sun) vector is pointing straight down, and the blue (normal) vector is pointing straight up, the dot product will be -1. This value is the cosine angle between the two vectors. The great thing about cosines is that they are always values between -1 and 1 so you can use this range to determine how bright the light should be at a certain point. Take the following example:
The dot product of sunlight and normal vectors The sun is pouring down from the sky with a direction vector of [2, -2, 0]. Vector A is a normal vector of [-2, 2, 0]. The two vectors are pointing in opposite directions, so when you turn the vectors into unit vectors (normalize them), the dot product of them will be -1. Vector B is a normal vector of [0.3, 2, 0]. Sunlight is a directional light, so uses the same direction vector. Sunlight and B when normalized have a dot product of -0.59.
This playground code demonstrates the calculations.
Dot product playground code Note: The result after line 8 shows that you should always be careful when using floating points, as results are never exact. Never use an expression such as if (x == 1.0). Instead, compare with >= or <=, or against a small tolerance. In the fragment shader, you’ll be able to take these values and multiply the fragment color by the dot product to get the brightness of the fragment.
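The playground itself only appears as a screenshot, so here’s a rough equivalent you could type into a playground yourself. The exact layout is an assumption; only the vectors come from the example above:
import simd

let sunDirection = normalize(SIMD3<Float>(2, -2, 0))
let normalA = normalize(SIMD3<Float>(-2, 2, 0))
let normalB = normalize(SIMD3<Float>(0.3, 2, 0))

print(dot(sunDirection, normalA)) // ≈ -1, but not exactly, because of floating point
print(dot(sunDirection, normalB)) // ≈ -0.59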
Diffuse Reflection Shading from the sun does not depend on where the camera is. When you rotate the scene, you’re rotating the world, including the sun. The sun’s position will be in world space, and you’ll put the model’s normals into the same world space to be able to calculate the dot product against the sunlight direction. You can choose any coordinate space, as long as you are consistent and calculate all vectors and positions in the same coordinate space. To be able to assess the slope of the surface in the fragment function, you’ll reposition the normals in the vertex function in much the same way as you repositioned the vertex position earlier. You’ll add the normals to the vertex descriptor so that the vertex function can process them.
➤ Open Shaders.metal, and add these properties to VertexOut: float3 worldPosition; float3 worldNormal;
These will hold the vertex position and vertex normal in world space. Calculating the new position of normals is a bit different from the vertex position calculation. MathLibrary.swift contains a matrix method to create a normal matrix from another matrix. This normal matrix is a 3×3 matrix, because firstly, you’ll do lighting in world space which doesn’t need projection, and secondly, translating an object does not affect the slope of the normals. Therefore, you don’t need the fourth W dimension. However, if you scale an object in one direction (non-linearly), then the normals of the object are no longer orthogonal and this approach won’t work. As long as you decide that your engine does not allow non-linear scaling, then you can use the upper-left 3×3 portion of the model matrix, and that’s what you’ll do here. ➤ Open Common.h and add this matrix property to Uniforms: matrix_float3x3 normalMatrix;
This will hold the normal matrix in world space. ➤ In the Geometry group, open Model.swift, and in render(encoder:uniforms:params:), add this after setting uniforms.modelMatrix: uniforms.normalMatrix = uniforms.modelMatrix.upperLeft
This creates the normal matrix from the model matrix. ➤ Open Shaders.metal, and in vertex_main, when defining out, populate the VertexOut properties: .worldPosition = (uniforms.modelMatrix * in.position).xyz, .worldNormal = uniforms.normalMatrix * in.normal
Here, you convert the vertex position and normal to world space. Earlier in the chapter, you sent Renderer’s lights array to the fragment function in the LightBuffer index, but you haven’t yet changed the fragment function to receive the array.
➤ Add this to fragment_main’s parameter list: constant Light *lights [[buffer(LightBuffer)]],
Creating Shared Functions in C++ Often you’ll want to access C++ functions from multiple files. Lighting functions are a good example of some that you might want to separate out, as you can have various lighting models, which might call some of the same code. To call a function from multiple .metal files: 1. Set up a header file with the name of the functions that you’re going to create. 2. Create a new .metal file and import the header, and also the bridging header file Common.h if you’re going to use a structure from that file. 3. Create the lighting functions in this new file. 4. In your existing .metal file, import the new header file and use the lighting functions. In the Shaders group, create a new Header File called Lighting.h. Don’t add it to any target. ➤ Add this before #endif /* Lighting_h */: #import "Common.h" float3 phongLighting( float3 normal, float3 position, constant Params &params, constant Light *lights, float3 baseColor);
Here, you define a C++ function that will return a float3.

In the Shaders group, create a new Metal File called Lighting.metal. Add it to both the iOS and macOS targets.

➤ Add this new function header:

#import "Lighting.h"

float3 phongLighting(
  float3 normal,
  float3 position,
  constant Params &params,
  constant Light *lights,
  float3 baseColor) {
  return float3(0);
}
You create a new function that returns a zero float3 value. You'll build up code in phongLighting to calculate this final lighting value.

➤ Open Shaders.metal, and replace #import "Common.h" with:

#import "Lighting.h"
Now you'll be able to use phongLighting within this file.

➤ In fragment_main, replace return float4(baseColor, 1); with this:

float3 normalDirection = normalize(in.worldNormal);
float3 color = phongLighting(
  normalDirection,
  in.worldPosition,
  params,
  lights,
  baseColor
);
return float4(color, 1);
Here, you make the world normal a unit vector, and call the new lighting function with the necessary parameters. If you build and run the app now, your models will render in black, as that’s the color that you’re currently returning from phongLighting.
No lighting
➤ Open Lighting.metal, and replace return float3(0); with:

float3 diffuseColor = 0;
float3 ambientColor = 0;
float3 specularColor = 0;
for (uint i = 0; i < params.lightCount; i++) {
  Light light = lights[i];
  switch (light.type) {
    case Sun: {
      break;
    }
    case Point: {
      break;
    }
    case Spot: {
      break;
    }
    case Ambient: {
      break;
    }
    case unused: {
      break;
    }
  }
}
return diffuseColor + specularColor + ambientColor;
This sets up the outline for all the lighting calculations you'll do. You'll accumulate the final fragment color, made up of diffuse, specular and ambient contributions.

➤ Above break in case Sun, add this:

// 1
float3 lightDirection = normalize(-light.position);
// 2
float diffuseIntensity =
  saturate(-dot(lightDirection, normal));
// 3
diffuseColor += light.color * baseColor * diffuseIntensity;
Going through this code:

1. You make the light's direction a unit vector.
2. You calculate the dot product of the two vectors. When the fragment fully points toward the light, the dot product will be -1. It's easier for further calculation to make this value positive, so you negate the dot product. saturate makes sure the value is between 0 and 1 by clamping the negative numbers. This gives you the slope of the surface, and therefore the intensity of the diffuse factor.
3. Multiply the base color by the diffuse intensity to get the diffuse shading. If you have several sun lights, diffuseColor will accumulate the diffuse shading.

➤ Build and run the app.
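If you want to double-check the dot product reasoning on the CPU first, here's a small playground-style sketch using simd. The vectors are made up for illustration; they aren't part of the project:

import simd

// Made-up values, for illustration only.
// The sun's position doubles as its direction, just as in the shader.
let sunPosition = SIMD3<Float>(1, 2, -2)
let lightDirection = normalize(-sunPosition)

// A surface normal pointing straight up.
let normal = SIMD3<Float>(0, 1, 0)

// saturate in MSL clamps to 0...1; min/max does the same here.
let diffuseIntensity = min(max(-dot(lightDirection, normal), 0), 1)
print(diffuseIntensity) // ≈ 0.67: the surface partly faces the light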
Diffuse shading

You can sanity-check your results by returning your intermediate calculations from phongLighting. The following image shows normal and diffuseIntensity from the front view.
Visualizing the normal and diffuse intensity

Note: To get the front view in your app, press "1" above the alpha keys while running it. "2" will reset to the default view.

DebugLights.swift and DebugLights.metal in the Utility group have some debugging methods so that you can visualize where your lights are.
➤ Open DebugLights.swift, and remove /* and */ at the top and bottom of the file. Before you added code in this chapter, this file would not compile, but now it does.

➤ Open Renderer.swift, and toward the end of draw(scene:in:), before renderEncoder.endEncoding(), add this:

DebugLights.draw(
  lights: scene.lighting.lights,
  encoder: renderEncoder,
  uniforms: uniforms)
This code will display lines to visualize the direction of the sun light. ➤ Build and run the app.
Debugging sunlight direction

The red lines show the parallel sun light direction vectors. As you rotate the scene, you can see that the brightest parts are the ones facing toward the sun.

Note: The debug method uses .line as the rendering type. Unfortunately, line width is not configurable on the GPU, so the lines may disappear at certain angles when they are too thin to render.

This shading is pleasing, but not accurate. Take a look at the back of the sphere. The back of the sphere is black; however, you can see that the top of the green surround is bright green because it's facing up. In the real world, the surround would be blocked by the sphere and so be in the shade. However, you're currently not taking occlusion into account, and you won't be until you master shadows in Chapter 13, "Shadows".
Ambient Reflection

In the real world, colors are rarely pure black. There's light bouncing about all over the place. To simulate this, you can use ambient lighting. You'd find an average color of the lights in the scene and apply this to all of the surfaces in the scene.

➤ Open SceneLighting.swift, and add an ambient light property:

let ambientLight: Light = {
  var light = Self.buildDefaultLight()
  light.color = [0.05, 0.1, 0]
  light.type = Ambient
  return light
}()
This light has a slight green tint.

➤ Add this to the end of init():

lights.append(ambientLight)
➤ Open Lighting.metal, and above break in case Ambient, add this:

ambientColor += light.color;
➤ Build and run the app. The scene is now tinged green as if there is a green light being bounced around. Change light.intensity in SceneLighting if you want less pronounced ambient light. This image has an ambient color of [0.05, 0.2, 0]:
Ambient lighting
Specular Reflection

Last but not least in the Phong reflection model is the specular reflection. You now have a chance to put a coat of shiny varnish on the sphere. The specular highlight depends upon the position of the observer. If you pass a shiny car, you'll only see the highlight at certain angles.
Specular reflection

The light comes in (L) and is reflected (R) about the normal (N). If the viewer (V) is within a particular cone around the reflection (R), the viewer will see the specular highlight. The size of that cone is controlled by an exponential shininess parameter: the shinier the surface, the smaller and more intense the specular highlight.

In your case, the viewer is your camera, so you'll need to pass the camera coordinates, again in world space, to the fragment function. Earlier, you set up a cameraPosition property in params, and this is what you'll use to pass the camera position.

➤ Open Renderer.swift, and in updateUniforms(scene:), add this:

params.cameraPosition = scene.camera.position

scene.camera.position is already in world space, and you're already passing params to the fragment function, so you don't need to take further action here.
➤ Open Lighting.metal, and in phongLighting, add the following variables to the top of the function:

float materialShininess = 32;
float3 materialSpecularColor = float3(1, 1, 1);
These hold the surface material properties: a shininess factor and the specular color. As these are surface properties, you should be getting these values from each model's materials, and you'll do that in the following chapter.

➤ Above break in case Sun, add the following:

if (diffuseIntensity > 0) {
  // 1 (R)
  float3 reflection =
    reflect(lightDirection, normal);
  // 2 (V)
  float3 viewDirection =
    normalize(params.cameraPosition);
  // 3
  float specularIntensity =
    pow(saturate(dot(reflection, viewDirection)), materialShininess);
  specularColor +=
    light.specularColor * materialSpecularColor * specularIntensity;
}
Going through this code:

1. For the calculation of the specular color, you'll need (L)ight, (R)eflection, (N)ormal and (V)iew. You already have (L) and (N), so here you use the Metal Shading Language function reflect to get (R).
2. You need the view vector between the fragment and the camera for (V).
3. Now you calculate the specular intensity. You find the angle between the reflection and the view using the dot product, clamp the result between 0 and 1 using saturate, and raise the result to a shininess power using pow. You then use this intensity to work out the specular color for the fragment.
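As a quick aside, you can verify the specular math on the CPU with simd. This is only a sketch with made-up vectors; the reflection is computed by hand using the same formula MSL's reflect uses:

import Foundation
import simd

// Made-up vectors, for illustration only.
let lightDirection = normalize(SIMD3<Float>(-1, -1, 0))
let normal = SIMD3<Float>(0, 1, 0)

// reflect(l, n) in MSL is l - 2 * dot(l, n) * n.
let reflection = lightDirection - 2 * dot(lightDirection, normal) * normal

// A viewer slightly off the reflection direction.
let viewDirection = normalize(SIMD3<Float>(-1, 2, 0))

let materialShininess: Float = 32
let specularIntensity =
  powf(min(max(dot(reflection, viewDirection), 0), 1), materialShininess)
print(specularIntensity) // ≈ 0.19: dimmer than a viewer lined up with the reflection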
➤ Build and run the app to see your completed lighting.
Specular reflection

Experiment with changing materialShininess from 2 to 1600. In Chapter 11, "Maps & Materials", you'll find out how to read in material and texture properties from the model to change its color and lighting.

You've created a realistic enough lighting situation for a sun. You can add more variety to your scene with point and spot lights.
Point Lights

As opposed to the sun light, where you converted the position into parallel direction vectors, point lights shoot out light rays in all directions.
Point light direction
A light bulb will only light an area of a certain radius, beyond which everything is dark. So you’ll also specify attenuation where a ray of light doesn’t travel infinitely far.
Point light attenuation

Light attenuation can occur abruptly or gradually. The original OpenGL formula for attenuation is:

attenuation = 1 / (x + y * d + z * d * d)

Where d is the distance from the light, x is the constant attenuation factor, y is the linear attenuation factor and z is the quadratic attenuation factor. The formula gives a curved fall-off. You'll represent x, y and z with a float3. No attenuation at all will be float3(1, 0, 0): substituting those values for x, y and z results in a value of 1.

➤ Open SceneLighting.swift, and add a point light property to SceneLighting:

let redLight: Light = {
  var light = Self.buildDefaultLight()
  light.type = Point
  light.position = [-0.8, 0.76, -0.18]
  light.color = [1, 0, 0]
  light.attenuation = [0.5, 2, 1]
  return light
}()
Here, you create a red point light with a position and attenuation. You can experiment with the attenuation values to change the radius and fall-off.

➤ Add the light to lights in init():

lights.append(redLight)
➤ Build and run the app.
Debugging a point light

You'll see a small red dot which marks the position of the point light rendered by DebugLights.

Note: The shader for the point light debug dot is worth looking at. In DebugLights.metal, in fragment_debug_point, the default square point is turned into a circle by discarding fragments greater than a certain radius from the center of the point.

The debug lights function shows you where the point light is, but it does not produce any light yet. You'll do this in the fragment shader.

➤ Open Lighting.metal, and in phongLighting, add this above break in case Point:

// 1
float d = distance(light.position, position);
// 2
float3 lightDirection = normalize(light.position - position);
// 3
float attenuation = 1.0 / (light.attenuation.x +
    light.attenuation.y * d + light.attenuation.z * d * d);
float diffuseIntensity =
  saturate(dot(lightDirection, normal));
float3 color = light.color * baseColor * diffuseIntensity;
// 4
color *= attenuation;
diffuseColor += color;
Going through this code:

1. You find out the distance between the light and the fragment position.
2. With the directional sun light, you used the position as the light direction. Here, you calculate the direction from the fragment position to the light position.
3. Calculate the attenuation using the attenuation formula and the distance to see how bright the fragment will be.
4. After calculating the diffuse color as you did for the sun light, multiply this color by the attenuation.

➤ Build and run the app, and you'll see the full effect of the red point light.
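To get a feel for how the attenuation values shape the fall-off, here's a small CPU-side sketch that evaluates the formula for the red light's attenuation of [0.5, 2, 1] at a few distances. It's purely illustrative and isn't part of the project:

import simd

// The red light's attenuation values: x constant, y linear, z quadratic.
let attenuation: SIMD3<Float> = [0.5, 2, 1]

func attenuationFactor(at distance: Float) -> Float {
  1 / (attenuation.x
    + attenuation.y * distance
    + attenuation.z * distance * distance)
}

let distances: [Float] = [0.1, 0.5, 1, 2]
for d in distances {
  print(d, attenuationFactor(at: d))
}
// 0.1 ≈ 1.41, 0.5 ≈ 0.57, 1 ≈ 0.29, 2 ≈ 0.12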
Rendering a point light

Remember, the sphere is slightly green because of the ambient light.
Spotlights

The last type of light you'll create in this chapter is the spotlight. This sends light rays in limited directions. Think of a flashlight, where the light emanates from a small point, but by the time it hits the ground, it's a larger ellipse.
You define a cone angle to contain the light rays, and a cone direction for the cone to point along. You also define a cone power to control the attenuation at the edge of the ellipse.
Spotlight angle and attenuation

➤ Open SceneLighting.swift, and add a new light:

lazy var spotlight: Light = {
  var light = Self.buildDefaultLight()
  light.type = Spot
  light.position = [-0.64, 0.64, -1.07]
  light.color = [1, 0, 1]
  light.attenuation = [1, 0.5, 0]
  light.coneAngle = Float(40).degreesToRadians
  light.coneDirection = [0.5, -0.7, 1]
  light.coneAttenuation = 8
  return light
}()
This light is similar to the point light, with the added cone angle, cone direction and cone attenuation.

➤ Add the light to lights in init():

lights.append(spotlight)
➤ Open Lighting.metal, and in phongLighting, add this code above break in case Spot:

// 1
float d = distance(light.position, position);
float3 lightDirection = normalize(light.position - position);
// 2
float3 coneDirection = normalize(light.coneDirection);
float spotResult = dot(lightDirection, -coneDirection);
// 3
if (spotResult > cos(light.coneAngle)) {
  float attenuation = 1.0 / (light.attenuation.x +
      light.attenuation.y * d + light.attenuation.z * d * d);
  // 4
  attenuation *= pow(spotResult, light.coneAttenuation);
  float diffuseIntensity =
    saturate(dot(lightDirection, normal));
  float3 color = light.color * baseColor * diffuseIntensity;
  color *= attenuation;
  diffuseColor += color;
}
This code is very similar to the point light code. Going through the comments:

1. Calculate the distance and direction as you did for the point light. This ray of light may be outside of the spot cone.
2. Calculate the cosine of the angle (the dot product) between that ray direction and the direction the spotlight is pointing. Vectors pointing in the same direction have a dot product of 1.0.
3. If that result is outside of the cone angle, ignore the ray. Otherwise, calculate the attenuation as for the point light.
4. Calculate the attenuation at the edge of the spotlight using coneAttenuation as the power.

➤ Build and run the app.
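If you'd like to convince yourself of the cone test, here's a small CPU-side sketch using the spotlight's values from SceneLighting. The fragment position is a made-up point straight down the cone axis:

import Foundation
import simd

// The spotlight's cone values from SceneLighting.
let coneAngle: Float = 40 * .pi / 180
let coneDirection = normalize(SIMD3<Float>(0.5, -0.7, 1))
let lightPosition = SIMD3<Float>(-0.64, 0.64, -1.07)

// A made-up fragment 1.5 units straight down the cone axis.
let fragmentPosition = lightPosition + coneDirection * 1.5
let lightDirection = normalize(lightPosition - fragmentPosition)
let spotResult = dot(lightDirection, -coneDirection)
print(spotResult > cosf(coneAngle)) // true: this fragment is inside the cone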
Rendering a spotlight

Experiment with changing the various attenuations. A cone angle of 5º with attenuation of (1, 0, 0) and a cone attenuation of 1000 will produce a very small, targeted soft light, whereas a cone angle of 20º with a cone attenuation of 1 will produce a sharp-edged round light.
Key Points

• Shading is the reason why objects don't look flat. Lights provide illumination from different directions.
• Normals describe the slope of the surface at a point. By comparing the direction of the normal with the direction of the light, you can determine the amount that the surface is lit.
• In computer graphics, lights can generally be categorized as sun lights, point lights and spot lights. In addition, you can have area lights, and surfaces can emit light. These are only approximations of real-world lighting scenarios.
• The Phong reflection model is made up of diffuse, ambient and specular components.
• Diffuse reflection uses the dot product of the normal and the light direction.
• Ambient reflection is a value added to all surfaces in the scene.
• Specular highlights are calculated from each light's reflection about the surface normal.
Where to Go From Here?

You've covered a lot of lighting information in this chapter. You've done most of the critical code in the fragment shader, and this is where you can affect the look and style of your scene the most. You've done some weird and wonderful calculations by working out dot products between surface normals and various light directions.

The formulas you used in this chapter are a small cross-section of computer graphics research that various brilliant mathematicians have come up with over the years. If you want to read more about lighting, you'll find some interesting internet sites listed in references.markdown in the resources folder for this chapter.

In the next chapter, you'll continue learning another important method of changing how a surface looks with texture maps and materials.
Section II: Intermediate Metal
With the basics under your belt, you can move on to multi-pass rendering. You’ll add shadows and learn several new rendering techniques. Programming the GPU using compute shaders can be intimidating, so you’ll create particle systems to learn how fast multi-threaded solutions can be.
Chapter 11: Maps & Materials
In the previous chapter, you set up a simple Phong lighting model. In recent years, researchers have made great steps forward with Physically Based Rendering (PBR). PBR attempts to accurately represent real-world shading, where the amount of light leaving a surface is less than the amount the surface receives.

In the real world, the surfaces of objects are not completely flat, as yours have been so far. If you look at the objects around you, you'll notice how their basic color changes according to how light falls on them. Some objects have a smooth surface, and some have a rough surface. Heck, some might even be shiny metal!

In this chapter, you'll find out how to use material groups to describe a surface, and how to design textures for micro detail.
Normal Maps

The following example best describes normal maps:
An object rendered with a normal map

On the left, there's a lit cube with a color texture. On the right, there's the same low-poly cube with the identical color texture and lighting. The only difference is that the cube on the right also has a second texture applied to it, known as a normal map. This normal map makes it appear as if the cube is a high-poly model with lots of nooks and crannies. In truth, these fine details are just an illusion.

For this illusion to work, the model needs a texture, like this:
A normal map texture

All models have normals that stick out perpendicular to each face. A cube has six faces, and the normal for each face points in a different direction. Also, each face is flat. If you wanted to create the illusion of bumpiness, you'd need to change a normal in the fragment shader.
Look at the following image. On the left is a flat surface with normals in the fragment shader. On the right, you see perturbed normals. The texels in a normal map supply the direction vectors of these normals through the RGB channels.
Normals

Now, look at this single brick split out into the red, green and blue channels that make up an RGB image.
Normal map channels

Each channel has a value between 0 and 1, and you generally visualize them in grayscale, as it's easier to read the values that way. For example, in the red channel, a value of 0 is no red at all, while a value of 1 is full red. When you convert 0 to an RGB color (0, 0, 0), the result is black. At the opposite end of the spectrum, (1, 1, 1) is white. And in the middle, you have (0.5, 0.5, 0.5), which is mid-gray. In grayscale, all three RGB values are the same, so you only need to refer to a grayscale value by a single float.

Take a closer look at the edges of the red channel's brick, looking at the left and right edges in the grayscale image. The red channel has the darkest color where the normals of those fragments should point left (-X, 0, 0), and the lightest color where they should point right (+X, 0, 0).

Now look at the green channel. The left and right edges have equal values, but the top and bottom edges of the brick differ. The green channel in the grayscale image is darkest where the normals should point down (0, -Y, 0) and lightest where they should point up (0, +Y, 0).

Finally, the blue channel is mostly white in the grayscale image because the brick (except for a few irregularities in the texture) points outward. The edges of the brick are the only places where the normals should point away.
Note: Normal maps can be either right-handed or left-handed. Your renderer will expect positive y to be up, but some apps will generate normal maps with positive y down. To fix this, you can take the normal map into Photoshop and invert the green channel.

The base color of a normal map, where all normals are "normal" (orthogonal to the face), is (0.5, 0.5, 1).
A flat normal map

This is an attractive color, but it was not chosen arbitrarily. RGB colors have values between 0 and 1, whereas a model's normal values are between -1 and 1. A color value of 0.5 in a normal map translates to a model normal of 0. Reading a flat texel from a normal map should give a z value of 1 and x and y values of 0. Converting these values (0, 0, 1) into the color space of a normal map results in the color (0.5, 0.5, 1). This is why most normal maps appear bluish.
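The mapping between normal-map colors and normals is just a scale and offset. Here's a tiny sketch of both directions; you'll use the decode half in the shader later in this chapter:

import simd

// Colors are 0...1; normals are -1...1.
func decode(_ color: SIMD3<Float>) -> SIMD3<Float> {
  color * 2 - 1
}

func encode(_ normal: SIMD3<Float>) -> SIMD3<Float> {
  normal * 0.5 + 0.5
}

print(decode([0.5, 0.5, 1])) // (0, 0, 1): the flat, straight-out normal
print(encode([0, 0, 1]))     // (0.5, 0.5, 1): the typical bluish base color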
Creating Normal Maps

To create successful normal maps, you need a specialized app. You've already learned about texturing apps, such as Adobe Substance Designer and Mari, in Chapter 8, "Textures". Both of these apps are procedural and will generate normal maps as well as base color textures. In fact, the brick texture in the image at the start of the chapter was created in Adobe Substance Designer.

Sculpting programs, such as ZBrush, 3D-Coat, Mudbox and Blender, will also generate normal maps from your sculpts. You first sculpt a detailed high-poly mesh, and then the app looks at the cavities and curvatures of your sculpt and bakes a normal map. Because high-poly meshes with tons of vertices aren't resource-efficient in games, you should create a low-poly mesh and then apply the normal map to this mesh.
Photoshop CC (from 2015) and Adobe Substance 3D Sampler can generate a normal map from a photograph or diffuse texture. Because these apps look at the shading and calculate the values, they aren’t as good as the sculpting or procedural apps, but it can be quite amazing to take a photograph of a real-life, personal object, run it through one of these apps, and render out a shaded model. Here’s a normal map that was created using Allegorithmic’s legacy app Bitmap2Material:
A cross photographed and converted into a normal map

On the right, the normal map with a white color texture is rendered onto the same cube model as before, with minimal geometry.
Tangent Space

To render with a normal map texture, you send it to the fragment function in the same way as a color texture, and you extract the normal values using the same UVs. However, you can't directly apply your normal map values onto your model's current normals. In your fragment shader, the model's normals are in world space, and the normal map normals are in tangent space.

Tangent space is a little hard to wrap your head around. Think of the brick cube with all its six faces pointing in different directions. Now think of the normal map, with the same brick colors on all six faces.
If a cube face is pointing toward negative x, how does the normal map know to point in that direction?
Normals on a sphere

Using a sphere as an example, every fragment has a tangent: that's the line that touches the sphere at that point. The normal vector in this tangent space is thus relative to the surface. You can see that all of the arrows are at right angles to the tangent. So if you took all of the tangents and laid them out on a flat surface, the blue arrows would point upward in the same direction. That's tangent space!

The following image shows a cube's normals in world space.
Visualizing normals in world space
To convert the cube’s normals to tangent space, you create a TBN matrix - that’s a Tangent Bitangent Normal matrix that’s calculated from the tangent, bitangent and normal value for each vertex.
The TBN matrix

In the TBN matrix, the normal is the perpendicular vector as usual; the tangent is the vector that points along the horizontal surface; and the bitangent is the vector, calculated by the cross product, that is perpendicular to both the tangent and the normal.

Note: The cross product is an operation that gives you a vector perpendicular to two other vectors.

The tangent can be at right angles to the normal in any direction. However, to share normal maps across different parts of models, and even entirely different models, there are two standards:

1. The tangent and bitangent will represent the directions that u and v point, respectively, defined in model space.
2. The red channel will represent curvature along u, and the green channel, along v.

You could calculate these values when you load the model. However, with Model I/O, as long as you have data for both the position and texture coordinate attributes, Model I/O can calculate and store these tangent and bitangent values at each vertex for you. Finally, some code! :]
The Starter App

➤ In Xcode, open the starter project for this chapter.

There are different models in the project, with accompanying textures in Textures.xcassets. There are two lights in the scene: a sun light and a gentle directional fill light from the back. The code is the same as at the end of the previous chapter, with the exception of GameScene and SceneLighting, which simply set up the different models and lighting. Pressing "1" and "2" takes you to the front and default views, respectively.

➤ Build and run the app, and you'll see a quaint cartoon cottage.
A cartoon cottage

It's a bit plain, but you're going to add a normal map to help give it some surface details.
Using Normal Maps

➤ In the Models ▸ Cottage group, open cottage1.mtl in a text editor.

There are two textures needed to render this cottage:

map_tangentSpaceNormal cottage-normal
map_Kd cottage-color

map_Kd defines the color map, and map_tangentSpaceNormal defines the normal map. cottage-color and cottage-normal are textures in Textures.xcassets.
The normal map holds data instead of color. Later, you'll use other texture maps for other surface qualities. Looking at textures like cottage-normal in a photo editor, you'd think they are color, but the trick is to regard the RGB values as numerical data instead of color data.

➤ In the Geometry group, open Submesh.swift, and add a new property to Submesh.Textures:

let normal: MTLTexture?
➤ At the end of Submesh.Textures.init(material:), read in this texture:

normal = property(with: .tangentSpaceNormal)
This is the normal map property that Model I/O expects to read in from the .mtl file.

➤ In the Shaders group, open Common.h, and add this to TextureIndices:

NormalTexture = 1
You'll send the normal texture to the fragment shader using this index.

➤ Open Model.swift, and in render(encoder:uniforms:params:), locate where you set the base color texture inside for submesh in mesh.submeshes.

➤ Add this:

encoder.setFragmentTexture(
  submesh.textures.normal,
  index: NormalTexture.index)
Here, you send the normal texture to the GPU.

➤ Open Shaders.metal, and in fragment_main, add the normal texture to the list of parameters:

texture2d<float> normalTexture [[texture(NormalTexture)]]
Now that you're transferring the normal texture map, the first step is to apply it to the cottage as if it were a color texture.

➤ In fragment_main, before calling phongLighting, add this:

float3 normal;
if (is_null_texture(normalTexture)) {
  normal = in.worldNormal;
} else {
  normal = normalTexture.sample(
    textureSampler,
    in.uv * params.tiling).rgb;
}
normal = normalize(normal);
return float4(normal, 1);
This reads normal from the texture, if there is one. If there is no normal map texture for this model, you fall back to the model's world normal. The return is only temporary, to make sure the app is loading the normal map correctly and that the normal map and UVs match.
The normal map applied as a color texture You can see all the surface details the normal map will provide. There are scattered bricks on the wall, wood grain on the door and windows and a shingle-looking roof. ➤ Excellent! You tested that the normal map loads, so remove this from fragment_main: return float4(normal, 1);
You may have noticed that in the normal map’s bricks, along the main surface of the house, the red seems to point along negative y, and the green seems to map to negative x.
raywenderlich.com
277
Metal by Tutorials
Chapter 11: Maps & Materials
You might expect that red (1, 0, 0) maps to x and green (0, 1, 0) maps to y. This happens because the UV island for the main part of the house is rotated 90 degrees counterclockwise.
A normal map overlaid with UVs Not to worry, the mesh’s stored tangents will map everything correctly. They take UV rotation into account. Don’t celebrate just yet. You still have several tasks ahead of you. You still need to: 1. Load tangent and bitangent values using Model I/O. 2. Tell the render command encoder to send the newly created MTLBuffers containing the values to the GPU. 3. In the vertex shader, change the values to world space — just as you did normals — and pass the new values to the fragment shader. 4. Calculate the new normal based on these values.
1. Load Tangents and Bitangents ➤ Open VertexDescriptor.swift, and look at MDLVertexDescriptor’s defaultLayout. Here, you tell the vertex descriptor that there are normal values in the attribute named MDLVertexAttributeNormal.
raywenderlich.com
278
Metal by Tutorials
Chapter 11: Maps & Materials
So far, your models have normal values included with them, but you may come across odd files where you have to generate normals. You can also override how the modeler smoothed the model. For example, the house model has smoothing applied in Blender so that the roof, which has very few faces, does not appear too blocky.
Flat vs smooth shaded Smoothing recalculates vertex normals so that they interpolate smoothly over a surface. Blender stores smoothing groups in the .obj file, which Model I/O reads in and understands. Notice in the above image, that although the surface of the sphere is smooth, the edges are unchanged and pointy. Smoothing only changes the way the renderer evaluates the surface. Smoothing does not change the geometry. Try reloading vertex normals and overriding the smoothing. ➤ Open Model.swift, and in init(name:), replace: let (mdlMeshes, mtkMeshes) = try! MTKMesh.newMeshes( asset: asset, device: Renderer.device)
With: var mtkMeshes: [MTKMesh] = [] let mdlMeshes = asset.childObjects(of: MDLMesh.self) as? [MDLMesh] ?? [] _ = mdlMeshes.map { mdlMesh in mdlMesh.addNormals( withAttributeNamed: MDLVertexAttributeNormal, creaseThreshold: 1.0) mtkMeshes.append( try! MTKMesh( mesh: mdlMesh, device: Renderer.device)) }
raywenderlich.com
279
Metal by Tutorials
Chapter 11: Maps & Materials
You’re now loading the MDLMeshes first and changing them before initializing the MTKMeshes. You ask Model I/O to recalculate normals with a crease threshold of 1. This crease threshold, between 0 and 1, determines the smoothness, where 1.0 is unsmoothed. ➤ Build and run the app.
An unsmoothed model The cottage is now completely unsmoothed, and you can see all of its separate faces. If you were to try a creaseThreshold of zero, where everything is smoothed, you’d get some rendering artifacts because of the surfaces rounding too far. When dealing with smoothness remember this: Smoothness is good, but use it with caution. The artist needs to set up the model with smoothing in mind. ➤ Remove the line you just added that reads: mdlMesh.addNormals( withAttributeNamed: MDLVertexAttributeNormal, creaseThreshold: 1.0)
➤ Replace it with this: mdlMesh.addTangentBasis( forTextureCoordinateAttributeNamed: MDLVertexAttributeTextureCoordinate, tangentAttributeNamed: MDLVertexAttributeTangent, bitangentAttributeNamed: MDLVertexAttributeBitangent)
All the supplied models have normals provided by Blender, but not tangents and bitangents. This new code generates and loads the vertex tangent and bitangent values.
raywenderlich.com
280
Metal by Tutorials
Chapter 11: Maps & Materials
Model I/O does a few things behind the scenes: • Add two named attributes to mdlMesh’s vertex descriptor: MDLVertexAttributeTangent and MDLVertexAttributeBitangent. • Calculate the tangent and bitangent values. • Create two new MTLBuffers to contain them. • Update the layout strides on mdlMesh’s vertex descriptor to match the two new buffers. With the addition of these two new attributes, you should update the default vertex descriptor so that the pipeline state in Renderer uses the same vertex descriptor. First, define the new buffer attribute and buffer indices. ➤ Open Common.h and add this to Attributes: Tangent = 4, Bitangent = 5
➤ Add the indices to BufferIndices: TangentBuffer = 3, BitangentBuffer = 4,
➤ Open VertexDescriptor.swift, and add this to MDLVertexDescriptor’s defaultLayout before return: vertexDescriptor.attributes[Tangent.index] = MDLVertexAttribute( name: MDLVertexAttributeTangent, format: .float3, offset: 0, bufferIndex: TangentBuffer.index) vertexDescriptor.layouts[TangentBuffer.index] = MDLVertexBufferLayout(stride: MemoryLayout.stride) vertexDescriptor.attributes[Bitangent.index] = MDLVertexAttribute( name: MDLVertexAttributeBitangent, format: .float3, offset: 0, bufferIndex: BitangentBuffer.index) vertexDescriptor.layouts[BitangentBuffer.index] = MDLVertexBufferLayout(stride: MemoryLayout.stride)
raywenderlich.com
281
Metal by Tutorials
Chapter 11: Maps & Materials
When you create the pipeline state in Renderer, the pipeline descriptor will now notify the GPU that it needs to create space for these two extra buffers. It’s important that you remember that your model’s vertex descriptor layout must match the one in the render encoder’s pipeline state. In addition, your shader’s VertexIn attributes should also match the vertex descriptor. Note: So far, you’ve only created one pipeline descriptor for all models. But often models will require different vertex layouts. Or if some of your models don’t contain normals, colors and tangents, you might wish to save on creating buffers for them. You can create multiple pipeline states for the different vertex descriptor layouts, and replace the render encoder’s pipeline state before drawing each model. ➤ Build and run the app to make sure your cottage still renders.
The cottage - no changes yet You’ve completed the necessary updates to the model’s vertex layouts, and now you’ll update the rendering code to match.
2. Send Tangent and Bitangent Values to the GPU ➤ Open Model.swift, and in render(encoder:uniforms:params:), locate for mesh in meshes. For each mesh, you’re currently sending all the vertex buffers to the GPU: for (index, vertexBuffer) in mesh.vertexBuffers.enumerated() { encoder.setVertexBuffer(
raywenderlich.com
282
Metal by Tutorials
}
Chapter 11: Maps & Materials
vertexBuffer, offset: 0, index: index)
This code includes sending the tangent and bitangent buffers. You should be aware of the number of buffers that you send to the GPU. In Common.h, you’ve set up UniformsBuffer as index 11, but if you had defined that as index 4, you’d now have a conflict with the bitangent buffer.
3. Convert Tangent and Bitangent Values to World Space

Just as you converted the model's normals to world space, you need to convert the tangents and bitangents to world space in the vertex function.

➤ Open Shaders.metal, and add these new attributes to VertexIn:

float3 tangent [[attribute(Tangent)]];
float3 bitangent [[attribute(Bitangent)]];
➤ Add new properties to VertexOut so that you can send the values to the fragment function:

float3 worldTangent;
float3 worldBitangent;
➤ In vertex_main, after calculating out.worldNormal, add this:

.worldTangent = uniforms.normalMatrix * in.tangent,
.worldBitangent = uniforms.normalMatrix * in.bitangent
This code moves the tangent and bitangent values into world space.
4. Calculate the New Normal

Now that you have everything in place, it'll be a simple matter to calculate the new normal. Before doing the normal calculation, consider the normal color value that you're reading. Colors are between 0 and 1, but normal values range from -1 to 1.

➤ Still in Shaders.metal, in fragment_main, locate where you read normalTexture. In the else part of the conditional, after reading the normal from the texture, add:
normal = normal * 2 - 1;
This code redistributes the normal value to be within the range -1 to 1.

➤ After the previous code, still inside the else part of the conditional, add this:

normal = float3x3(
  in.worldTangent,
  in.worldBitangent,
  in.worldNormal) * normal;
This matrix, built from the world-space tangent, bitangent and normal, converts the sampled tangent-space normal into world space, so it matches the space of your other lighting vectors.

➤ Replace the color definition with:

float3 color = phongLighting(
  normal,
  in.worldPosition,
  params,
  lights,
  baseColor
);
You send phongLighting your newly calculated normal.

➤ Remove normalDirection since you no longer need it:

float3 normalDirection = normalize(in.worldNormal);
➤ Build and run the app to see the normal map applied to the cottage.
The cottage with a normal map applied
As you rotate the cottage, notice how the lighting affects the small cavities on the model, especially on the door and roof where the specular light falls. It's almost like you created new geometry, but you didn't. That's the magic of normal maps: adding amazing detail to simple low-poly models.
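If the tangent space idea still feels abstract, this small sketch (with a made-up, axis-aligned basis) shows how the TBN matrix carries a sampled tangent-space normal into world space:

import simd

// A made-up basis: tangent along +x, bitangent along +y, normal along +z,
// as if this face pointed straight at the camera.
let worldTangent = SIMD3<Float>(1, 0, 0)
let worldBitangent = SIMD3<Float>(0, 1, 0)
let worldNormal = SIMD3<Float>(0, 0, 1)
let tbn = float3x3(columns: (worldTangent, worldBitangent, worldNormal))

// A flat normal-map texel (0.5, 0.5, 1) decodes to (0, 0, 1) in tangent space.
print(tbn * SIMD3<Float>(0, 0, 1)) // the face's own world normal, (0, 0, 1)

// A texel leaning toward +u tilts the world normal toward the tangent.
print(tbn * normalize(SIMD3<Float>(0.5, 0, 1)))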
Other Texture Map Types

Normal maps are not the only way of changing a model's surface. There are other texture maps:

• Roughness: Describes the smoothness or roughness of a surface. You'll add a roughness map shortly.
• Metallic: White for metal and black for dielectric. Metal is a conductor of electricity, whereas a dielectric material is a non-conductor.
• Ambient Occlusion: Describes areas that are occluded; in other words, areas that are hidden from light.
• Reflection: Identifies which part of the surface is reflective.
• Opacity: Describes the location of the transparent parts of the surface.

In fact, any value (thickness, curvature, etc.) that you can think of to describe a surface can be stored in a texture. You just look up the relevant fragment in the texture using the UV coordinates and use the value recovered. That's one of the bonuses of writing your own renderer: You can choose what maps to use and how to apply them. You can use all of these textures in the fragment shader, and the geometry doesn't change.

Note: A displacement or height map can change geometry. You'll read about displacement in Chapter 19, "Tessellation & Terrains".
Materials

Not all models have textures. For example, the train you rendered earlier in the book has different material groups that specify a color instead of using a texture.
In the Models ▸ Cottage group, open cottage1.mtl in a text editor. This is the file that describes the visual aspects of the cottage model. There are several material groups:

• glass
• roof
• wall
• wood

Each of these groups contains different material properties. Although you're loading the same diffuse texture for all of them, you could use different textures for each group. Where a property doesn't have an associated texture, you can extract the values:

• Ns: Specular exponent (shininess)
• Kd: Diffuse color
• Ks: Specular color

Note: You can find a full list of the definitions at https://bit.ly/3HBVnGn.

The current .mtl file loads the color using map_Kd, but for experimentation, you'll switch the rendered cottage file to one that gets its color from the material group and not a texture.

➤ Open cottage2.mtl, and see that none of the groups has a map_Kd property.

➤ Open GameScene.swift, and change cottage to use "cottage2.obj" instead of "cottage1.obj".

The diffuse color won't be the only material property you'll be reading.

➤ Open Common.h, and add a new structure to hold material values:

typedef struct {
  vector_float3 baseColor;
  vector_float3 specularColor;
  float roughness;
  float metallic;
  float ambientOcclusion;
  float shininess;
} Material;
There are more material properties available, but these are the most common. For now, you'll read in baseColor, specularColor and shininess.

➤ Open Submesh.swift, and create a new property in Submesh under textures to hold the materials:

let material: Material
Your project won't compile until you've initialized material.

➤ At the bottom of Submesh.swift, create a new Material extension with an initializer:

private extension Material {
  init(material: MDLMaterial?) {
    self.init()
    if let baseColor = material?.property(with: .baseColor),
      baseColor.type == .float3 {
      self.baseColor = baseColor.float3Value
    }
  }
}
In Submesh.Textures, you read in string values for the textures' file names from the submesh's material properties. If there's no texture available for a particular property, you can use the material base color. For example, if an object is solid red, you don't have to go to the trouble of making a texture; you can just use the material's base color of float3(1, 0, 0) to describe the color.

➤ Add the following code to the end of Material's init(material:):

if let specular = material?.property(with: .specular),
  specular.type == .float3 {
  self.specularColor = specular.float3Value
}
if let shininess = material?.property(with: .specularExponent),
  shininess.type == .float {
  self.shininess = shininess.floatValue
}
self.ambientOcclusion = 1
Here, you read the specular and shininess values from the submesh's materials. Currently, you're not loading or using ambient occlusion, but the default value should be 1.0 (white).
➤ In Submesh, in init(mdlSubmesh:mtkSubmesh:), after initializing textures, initialize material:

material = Material(material: mdlSubmesh.material)
You'll now send this material to the shader. This sequence of coding should be familiar to you by now.

➤ Open Common.h, and add another index to BufferIndices:

MaterialBuffer = 14
➤ Open Model.swift. In render(encoder:uniforms:params:), inside for submesh in mesh.submeshes where you call setFragmentTexture, add the following:

var material = submesh.material
encoder.setFragmentBytes(
  &material,
  length: MemoryLayout<Material>.stride,
  index: MaterialBuffer.index)
This code sends the material structure to the fragment shader. As long as your material structure's stride is less than 4KB, you don't need to create and hold a dedicated buffer.

➤ Open Shaders.metal, and add the following as a parameter of fragment_main:

constant Material &_material [[buffer(MaterialBuffer)]],
You pass the model's material properties to the fragment shader. You use _ in front of the name, as _material is constant, and soon you'll need to update the structure with the texture's base color if there is one.

➤ At the top of fragment_main, add this:

Material material = _material;
➤ In fragment_main, replace:

float3 baseColor;
if (is_null_texture(baseColorTexture)) {
  baseColor = in.color;
} else {
  baseColor = baseColorTexture.sample(
    textureSampler,
    in.uv * params.tiling).rgb;
}

With:

if (!is_null_texture(baseColorTexture)) {
  material.baseColor = baseColorTexture.sample(
    textureSampler,
    in.uv * params.tiling).rgb;
}
If the texture exists, you replace the material base color with the color extracted from the texture. Otherwise, you've already loaded the base color in material.

➤ Still in fragment_main, replace baseColor with material in phongLighting's arguments:

float3 color = phongLighting(
  normal,
  in.worldPosition,
  params,
  lights,
  material
);
Your project won't compile until you've updated phongLighting to match these parameters.

➤ Open Lighting.h, and replace float3 baseColor with:

Material material
➤ Open Lighting.metal, and in phongLighting's parameters, replace the parameter float3 baseColor with:

Material material
You’re now sending material to phongLighting instead of just the base color, so you’ll be able to render the appropriate material properties for each submesh.
➤ Add the following code to the top of phongLighting:

float3 baseColor = material.baseColor;
➤ Replace the assignments of materialShininess and materialSpecularColor with:

float materialShininess = material.shininess;
float3 materialSpecularColor = material.specularColor;
➤ Build and run the app, and you’re now loading cottage2 with the colors coming from the material Kd values instead of a texture.
Using color from the surface material

As you rotate the cottage, you can see that the roof, door and window frames are shiny, with strong specular highlights.

➤ In the Models ▸ Cottage group, open cottage2.mtl in a text editor, and in both the roof and wood groups, change:

• Ns to 1.0
• Ks to 0.2 0.2 0.2

These changes eliminate the specular highlights for those two groups.
➤ Build and run the app to see the difference:
With and without specular highlight

You can now render models with or without textures by reading in the values in the .mtl file. You've also found that it's very easy to change material values by editing them in the .mtl file.

As you can see, models have various requirements. Some models need a color texture; some models need a roughness texture; and some models need normal maps. It's up to you to check conditionally in the fragment function whether there are textures or constant material values.
Physically Based Rendering (PBR)

To achieve spectacular scenes, you need to have good textures, but shading plays an even more significant role. In recent years, the concept of PBR has replaced the simplistic Phong shading model. As its name suggests, PBR attempts to model the physically realistic interaction of light with surfaces. Now that Augmented Reality has become part of our lives, it's even more important to render your models to match their physical surroundings.

The general principles of PBR are:

• Surfaces should not reflect more light than they receive.
• Surfaces can be described with known, measured physical properties.
The Bidirectional Reflectance Distribution Function (BRDF) defines how a surface responds to light. There are various highly mathematical BRDF models for both diffuse and specular, but the most common are Lambertian diffuse and, for the specular, variations on the Cook-Torrance model (presented at SIGGRAPH 1981). This takes into account:

• Micro-facet slope distribution: You learned about micro-facets and how light bounces off surfaces in many directions in Chapter 10, "Lighting Fundamentals".
• Fresnel: If you look straight down into a clear lake, you can see through it to the bottom; however, if you look across the surface of the water, you only see a reflection, like a mirror. This is the Fresnel effect, where the reflectivity of the surface depends upon the viewing angle.
• Geometric attenuation: Self-shadowing of the micro-facets.

Each of these components has different approximations, or models, written by many clever people. It's a vast and complex topic. In the resources folder for this chapter, references.markdown contains a few places where you can learn more about physically based rendering and the calculations involved. You'll also learn some more about BRDF and Fresnel in Chapter 21, "Image-Based Lighting".

Artists generally provide some textures with their models that supply the BRDF values. These are the most common:

• Albedo: You already met the albedo map in the form of the base color map. Albedo is originally an astronomical term describing the measurement of diffuse reflection of solar radiation, but in computer graphics it has come to mean the surface color without any shading applied to it.
• Metallic: A surface is either a conductor of electricity, in which case it's a metal, or it isn't a conductor, in which case it's a dielectric. Most metal textures consist of 0 (black) and 1 (white) values only: 0 for dielectric and 1 for metal.
• Roughness: A grayscale texture that indicates the shininess of a surface. White is rough, and black is smooth. If you have a scratched shiny surface, the texture might consist of mostly black or dark gray with light gray scratch marks.
• Ambient Occlusion: A grayscale texture that defines how much light reaches a surface. For example, less light will reach nooks and crannies.

Included in the starter project is a fragment function that uses a Cook-Torrance model for specular lighting. It takes the above textures as input, as well as the color and normal textures.
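The starter's computeSpecular function isn't reproduced here, but to give you a flavor of the math involved, this sketch shows the widely used Schlick approximation to the Fresnel term. It's a standalone illustration, not the book's exact implementation:

import Foundation
import simd

// Schlick's approximation: reflectance rises toward 1 at grazing angles.
func fresnelSchlick(cosTheta: Float, f0: SIMD3<Float>) -> SIMD3<Float> {
  f0 + (1 - f0) * powf(1 - cosTheta, 5)
}

// A typical dielectric base reflectivity of about 4%.
let f0 = SIMD3<Float>(repeating: 0.04)
print(fresnelSchlick(cosTheta: 1, f0: f0))   // head-on: about 0.04
print(fresnelSchlick(cosTheta: 0.1, f0: f0)) // grazing: about 0.6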
PBR Workflow

First, change the fragment function to use the PBR calculations.

➤ Open Renderer.swift, and in init(metalView:options:), change the name of the fragment function from "fragment_main" to "fragment_PBR".

➤ In the Shaders group, open PBR.metal. In the File inspector, add the file to the macOS and iOS targets.
Target membership

➤ Examine fragment_PBR.

The function starts off similar to your previous fragment_main, but with a few more texture parameters in the function header. The function extracts values from textures when available, and calculates the normals the same way as before. For simplicity, it only processes the first light in the lights array: the main sun light.

fragment_PBR calls computeSpecular, which works through a Cook-Torrance shading model to calculate the specular highlight. Finally, it calls computeDiffuse to produce the diffuse color. The final color is the result of adding together the diffuse color and the specular highlight.
To add all of the PBR textures to your project is quite long-winded, so here you'll only add roughness. You'll add metallic and ambient occlusion in the challenge.

➤ Open Submesh.swift, and create a new property for roughness in Submesh.Textures:

let roughness: MTLTexture?
➤ In the Submesh.Textures extension, add the following code to the end of init(material:):

roughness = property(with: .roughness)
In addition to reading in a possible roughness texture, you need to read in the material value too.

➤ At the bottom of Material's init(material:), add:

if let roughness = material?.property(with: .roughness),
  roughness.type == .float3 {
  self.roughness = roughness.floatValue
}
➤ Open Model.swift, and in render(encoder:uniforms:params:), locate where you send the base color and normal textures to the fragment function, then add this code afterward:

encoder.setFragmentTexture(
  submesh.textures.roughness,
  index: 2)
➤ Open GameScene.swift, and change the name of the cottage model to "cube.obj".

➤ In init(), change the camera distance and target to:

camera.distance = 3.5
camera.target = .zero
These values frame the shape and size of the cube better.

➤ Open cube.mtl in the Models ▸ Cube group in a text editor. The roughness and normal maps are commented out with a #. The default roughness value is 1.0, which is completely rough.
➤ Build and run the app to see a cube with only an albedo texture applied.
A cube with albedo map applied

This texture has no lighting information baked into it. Textures altering the surface will change the lighting appropriately.

➤ In cube.mtl, remove the # in front of map_tangentSpaceNormal cube-normal.

➤ Build and run the app again to see the difference when the normal texture is applied.
A cube with normal and albedo maps applied

➤ Still in cube.mtl, remove the # in front of map_roughness cube-roughness.

➤ In the Textures group, open Textures.xcassets, and select cube-roughness. Select the image and press the spacebar to preview it.

The dark gray values will be smooth and shiny (exaggerated here for effect), and the white mortar between the bricks will be completely rough (not shiny).
The cube's texture maps

Compare the roughness map to the cube's color and normal maps to see how the model's UV layout is used for all the textures.

➤ Build and run the app to see the PBR function in action. Admire how much you can affect how a model looks with just a few textures and a bit of fragment shading.
The final rendered cube
Channel Packing

Later, you'll be using the PBR fragment function for rendering. Even if you don't understand the mathematics, understand the layout of the function and the concepts used.

When loading models built by various artists, you're likely going to come up against a variety of standards. Textures may be oriented differently; normals might point in a different direction; sometimes you may even find three textures magically contained in a single file, a technique known as channel packing. Channel packing is an efficient way of managing external textures.
To understand how it works, open PBR.metal and look at the code where the fragment function reads single floats: roughness, metallic and ambient occlusion. When the function reads the texture for each of these values, it's only reading the red channel. For example:

roughness = roughnessTexture.sample(textureSampler, in.uv).r;
Available within the roughness file are green and blue channels that are currently unused. As an example, you could use the green channel for metallic and the blue channel for ambient occlusion. Included in the resources folder for this chapter is an image named channelpacked.png. If you have Photoshop or some other graphics application capable of reading individual channels, open this file and inspect the channels.
Different channels in Photoshop

A different color channel contains each of the words. Similarly, you can load your different grayscale maps into each color channel. If you receive a file like this, you can split each channel into a different file by hiding channels and saving the new file.

If you're organizing your maps through an asset catalog, channel packing won't impact the memory consumption, and you won't gain much advantage. However, some artists do use it for easy texture management.
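On the CPU, the idea boils down to reading different channels of the same texel. This sketch assumes one packing convention (R = roughness, G = metallic, B = ambient occlusion); other artists may pack differently:

import simd

// One RGBA8 texel, with an assumed packing convention.
let texel: SIMD4<UInt8> = [200, 0, 255, 255]

let roughness = Float(texel.x) / 255         // red: ≈ 0.78
let metallic = Float(texel.y) / 255          // green: 0, a dielectric
let ambientOcclusion = Float(texel.z) / 255  // blue: 1, fully lit
print(roughness, metallic, ambientOcclusion)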
Challenge

In the resources folder for this chapter is a fabulous helmet model from Malopolska's Virtual Museums collection at sketchfab.com. Your challenge is to render this model. There are five textures that you'll load into the asset catalog. Don't forget to change Interpretation from Color to Data, so the textures don't load as sRGB.

Just as you did with the roughness texture, you'll add metallic and ambient occlusion textures to your code. You should also update TextureIndices with the correct buffer index numbers.
Rendering a helmet

If you get stuck, you'll find the finished project in the challenge folder.

The challenge project can also render USDZ files with textures. When Model I/O loads USDZ files, the textures are loaded as MDLTextures instead of string filenames. In the challenge project, there is an additional method in TextureController to cope with this, as well as extra functionality in Submesh.Textures. To load these textures, you also have to preload the asset textures when you load the asset in Model, by using asset.loadTextures().

You can download USDZ samples from Apple's AR Quick Look Gallery (https://apple.co/3qnxbQa) to try. The animated models, such as the toy robot, still won't work properly until after you've completed the animation chapters, but the static models, such as the car and the teapot, should render with textures once you scale the model down to 0.1.
Where to Go From Here?

Now that you've whetted your appetite for physically based rendering, explore the fantastic links in references.markdown, which you'll find in the resources folder for this chapter. Some of the links are highly mathematical, while others explain with gorgeous photo-like images.

Apple's sample code Using Function Specialization to Build Pipeline Variants (https://apple.co/3mvBUhM) is a fantastic piece of sample code to examine, complete with a gorgeous fire truck model. It uses function constants for creating different levels of detail depending on distance from the camera. As a further challenge, you can import the sample's fire truck into your renderer to see how it looks.

Remember, though, that you haven't yet implemented great lighting and reflection. In Chapter 21, "Image-Based Lighting", you'll explore how to light your scene with reflection from a skycube texture. Metallic objects look much more realistic when they have something to reflect.
Chapter 12: Render Passes
Up to this point, you've created projects that had only one render pass. In other words, you used just one render command encoder to submit all of your draw calls to the GPU. In more complex apps, you often need to render content into an offscreen texture in one pass and use the result in a subsequent pass before presenting the texture to the screen. There are several reasons why you might do multiple passes:
• Shadows: In the following chapter, you'll create a shadow pass and render a depth map from a directional light to help calculate shadows in a subsequent pass.
• Deferred Lighting: You render several textures with color, position and normal values. Then, in a final pass, you calculate lighting using those textures.
• Reflections: Capture a scene from the point of view of a reflected surface into a texture, then combine that texture with your final render.
• Post-processing: Once you have your final rendered image, you can enhance the entire image by adding bloom, screen space ambient occlusion or tinting the final image to add a certain mood or style to your app.
Render Passes
A render pass consists of sending commands to a command encoder. The pass ends when you end encoding on that command encoder. When setting up a render command encoder, you use a render pass descriptor. So far, you've used the MTKView currentRenderPassDescriptor, but you can define your own descriptor or make changes to the current render pass descriptor. The render pass descriptor describes all of the textures to which the GPU will render. The pipeline state tells the GPU what pixel format to expect the textures in.
A render pass
For example, the following render pass writes to four textures. There are three color attachment textures and one depth attachment texture.
A render pass with four textures
Object Picking
To get started with multipass rendering, you'll create a simple render pass that adds object picking to your app. When you click a model in your scene, that model will render in a slightly different shade. There are several ways to hit-test rendered objects. For example, you could do the math to convert the 2D touch location to a 3D ray and then perform ray intersection to see which object intersects the ray. Warren Moore describes this method in his Picking and Hit-Testing in Metal (https://bit.ly/3rlzm9b) article. Alternatively, you could render a texture where each object is rendered in a different color or object ID. Then, you calculate the texture coordinate from the screen touch location and read the texture to see which object was hit. You're going to store the model's object ID into a texture in one render pass. You'll then send the touch location to the fragment shader in the second render pass and read the texture from the first pass. If the fragment being rendered is from the selected object, you'll render that fragment in a different color.
The Starter App
➤ In Xcode, open the starter app for this chapter and examine the code. It's similar to the previous chapter but refactored.
• In the Render Passes group, ForwardRenderPass.swift contains the rendering code that used to be in Renderer along with the pipeline state and depth stencil state initialization. Separating this code will make it easier to have multiple render passes because you can then concentrate on getting the pipeline states and textures correct for each pass. In Renderer, draw(scene:in:) updates the uniforms, then tells the forward render pass to draw the scene.
• Pipelines.swift contains pipeline state creation. Later, PipelineStates will contain several more pipeline states.
• In the Game group, GameScene sets up new models in a scene.
• In the Geometry group, Model now has an objectId. When GameScene creates the model, it allocates a unique object ID. Model updates params with its objectId for the fragment function.
• In the SwiftUI Views group, MetalView has a gesture that forwards the mouse or touch location to InputController when the user clicks or taps the screen.
• In the Shaders group, Vertex.h now contains VertexIn and VertexOut. .metal files include this header where necessary.
➤ Build and run the app, and familiarize yourself with the code.
The starter app
Setting up Render Passes
Since you'll have multiple render passes performing similar procedures, it makes sense to have a protocol with some default methods.
➤ In Render Passes, create a new Swift file named RenderPass.swift, and replace the code with:
import MetalKit

protocol RenderPass {
  var label: String { get }
  var descriptor: MTLRenderPassDescriptor? { get set }
  mutating func resize(view: MTKView, size: CGSize)
  func draw(
    commandBuffer: MTLCommandBuffer,
    scene: GameScene,
    uniforms: Uniforms,
    params: Params
  )
}

extension RenderPass {
}
All render passes will have a render pass descriptor. The pass might create its own descriptor or use the view’s current render pass descriptor. They’ll all need to resize the render textures when the user resizes the window. All render passes will need a draw method. The extension will hold default render pass methods. ➤ Open ForwardRenderPass.swift, and conform ForwardRenderPass to RenderPass: struct ForwardRenderPass: RenderPass {
➤ Cut buildDepthStencilState() from ForwardRenderPass, and paste it into RenderPass’s extension. Multiple render passes will use this depth stencil state initialization method.
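For reference, a typical implementation of the method you just moved looks like this: standard depth testing with depth writes enabled. Your project's version from earlier chapters may differ slightly.
static func buildDepthStencilState() -> MTLDepthStencilState? {
  // Pass fragments that are closer to the camera, and write their depth.
  let descriptor = MTLDepthStencilDescriptor()
  descriptor.depthCompareFunction = .less
  descriptor.isDepthWriteEnabled = true
  return Renderer.device.makeDepthStencilState(descriptor: descriptor)
}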
Creating a UInt32 Texture
Textures don't only hold color. There are many pixel formats (https://apple.co/3Eby9oD). So far, you've used rgba8Unorm, a color format that contains four 8-bit integers for red, green, blue and alpha. Model's objectId is a UInt32, and in place of the model's color, you'll render its ID to a texture. You'll create a texture that holds UInt32s in a new render pass.
➤ In Render Passes, create a new Swift file named ObjectIdRenderPass.swift and replace the code with:
import MetalKit

struct ObjectIdRenderPass: RenderPass {
  let label = "Object ID Render Pass"
  var descriptor: MTLRenderPassDescriptor?
  var pipelineState: MTLRenderPipelineState

  mutating func resize(view: MTKView, size: CGSize) { }

  func draw(
    commandBuffer: MTLCommandBuffer,
    scene: GameScene,
    uniforms: Uniforms,
    params: Params
  ) { }
}
Here, you create the render pass with the required properties and methods to conform to RenderPass, along with a pipeline state object.
➤ Open Pipelines.swift, and add a method to PipelineStates to create the pipeline state object:
static func createObjectIdPSO() -> MTLRenderPipelineState {
  let pipelineDescriptor = MTLRenderPipelineDescriptor()
  // 1
  let vertexFunction = Renderer.library?.makeFunction(name: "vertex_main")
  let fragmentFunction =
    Renderer.library?.makeFunction(name: "fragment_objectId")
  pipelineDescriptor.vertexFunction = vertexFunction
  pipelineDescriptor.fragmentFunction = fragmentFunction
  // 2
  pipelineDescriptor.colorAttachments[0].pixelFormat = .r32Uint
  // 3
  pipelineDescriptor.depthAttachmentPixelFormat = .invalid
  pipelineDescriptor.vertexDescriptor = MTLVertexDescriptor.defaultLayout
  return Self.createPSO(descriptor: pipelineDescriptor)
}
Most of this code will be familiar to you, but there are some details to note:
1. You can use the same vertex function as you did to render the model because you'll render the vertices in the same position. However, you'll need a different fragment function to write the ID to the texture.
2. The color attachment's texture pixel format is a 32-bit unsigned integer. The GPU will expect you to hand it a texture in this format.
3. You'll come back and add a depth attachment, but for now, leave it invalid, which means that the GPU won't require a depth texture.
➤ Open ObjectIdRenderPass.swift, and create an initializer:
init() {
  pipelineState = PipelineStates.createObjectIdPSO()
  descriptor = MTLRenderPassDescriptor()
}
Here, you initialize the pipeline state and the render pass descriptor. Most render passes will require you to create a texture, so you'll create one that takes several different parameters.
➤ Open RenderPass.swift, and add a new method to the extension:
static func makeTexture(
  size: CGSize,
  pixelFormat: MTLPixelFormat,
  label: String,
  storageMode: MTLStorageMode = .private,
  usage: MTLTextureUsage = [.shaderRead, .renderTarget]
) -> MTLTexture? {
}
In addition to a size, you'll give the texture:
• A pixel format, such as rgba8Unorm. In this render pass, you give it r32Uint.
• By default, the storage mode is private, meaning the texture is stored in memory that only the GPU can access.
• The usage. You have to configure textures used by render pass descriptors as render targets. Render targets are memory buffers or textures that allow offscreen rendering for cases where the rendered pixels don't need to end up in the framebuffer. You'll also want to read the texture in shader functions, so you set up that default capability, too.
➤ Add this code to makeTexture(size:pixelFormat:label:storageMode:usage:):
let width = Int(size.width)
let height = Int(size.height)
guard width > 0 && height > 0 else { return nil }
let textureDesc = MTLTextureDescriptor.texture2DDescriptor(
  pixelFormat: pixelFormat,
  width: width,
  height: height,
  mipmapped: false)
textureDesc.storageMode = storageMode
textureDesc.usage = usage
guard let texture =
  Renderer.device.makeTexture(descriptor: textureDesc) else {
    fatalError("Failed to create texture")
}
texture.label = label
return texture
You configure a texture descriptor using the given parameters and create a texture from the descriptor. ➤ Open ObjectIdRenderPass.swift, and add a new property to ObjectIdRenderPass for the render texture: var idTexture: MTLTexture?
➤ Add this code to resize(view:size:): idTexture = Self.makeTexture( size: size, pixelFormat: .r32Uint, label: "ID Texture")
Every time the view size changes, you’ll rebuild the texture to match the view’s size.
Now for the draw.
➤ Add this code to draw(commandBuffer:scene:uniforms:params:):
guard let descriptor = descriptor else { return }
descriptor.colorAttachments[0].texture = idTexture
guard let renderEncoder =
  commandBuffer.makeRenderCommandEncoder(descriptor: descriptor)
  else { return }
You assign idTexture to the descriptor's first color attachment. You then create the render command encoder using this descriptor. The pixel format must match the render target textures when configuring the color attachment for the pipeline state object. In this case, you set them both to r32Uint.
➤ Add this code after the code you just added:
renderEncoder.label = label
renderEncoder.setRenderPipelineState(pipelineState)
for model in scene.models {
  model.render(
    encoder: renderEncoder,
    uniforms: uniforms,
    params: params)
}
renderEncoder.endEncoding()
Here, you set the pipeline state and render the models.
Adding the Render Pass to Renderer
➤ Open Renderer.swift, and add the new render pass property:
var objectIdRenderPass: ObjectIdRenderPass
➤ In init(metalView:options:), add this code before super.init() to initialize the render pass: objectIdRenderPass = ObjectIdRenderPass()
➤ Add this code to mtkView(_:drawableSizeWillChange:): objectIdRenderPass.resize(view: view, size: size)
Here, you ensure that idTexture's size matches the view size. Renderer's initializer calls mtkView(_:drawableSizeWillChange:), so your texture in the render pass is initialized and sized appropriately.
➤ Add this code to draw(scene:in:) immediately after updateUniforms(scene: scene):
objectIdRenderPass.draw(
  commandBuffer: commandBuffer,
  scene: scene,
  uniforms: uniforms,
  params: params)
Excellent, you’ve set up the render pass. Now all you have to do is create the fragment shader function to write to idTexture.
Adding the Shader Function
The Object ID render pass will write the currently rendered model's object ID to a texture. You don't need any of the vertex information in the fragment function.
➤ In Shaders, create a new Metal File named ObjectId.metal and add:
#import "Common.h"

// 1
struct FragmentOut {
  uint objectId [[color(0)]];
};

// 2
fragment FragmentOut fragment_objectId(
  constant Params &params [[buffer(ParamsBuffer)]])
{
  // 3
  FragmentOut out {
    .objectId = params.objectId
  };
  return out;
}
Going through this code:
1. You create a structure that matches the render pass descriptor color attachment. Color attachment 0 contains the object ID texture.
2. The fragment function takes in params, of which you only need the object ID.
3. You create a FragmentOut instance and write the current object ID to it. You then return it from the fragment function, and the GPU writes the fragment into the given texture.
➤ Build and run the app. You won't see a difference in your render. Currently, you're not passing on the object ID texture to the second render pass.
No difference to the render
➤ Capture the GPU workload by clicking the Metal icon and clicking Capture in the popup.
The GPU workload capture icon
➤ Click the command buffer, and you’ll see two render passes. The Object ID render pass is on the left with an R32Uint pixel format texture. The usual forward render pass is on the right and has a color texture and a depth texture.
The GPU workload capture
Even though the store action of the Object ID's texture is Store, you aren't using it in the following render pass. This causes a purple exclamation mark warning.
➤ Double-click the red and black texture. This is idTexture.
Under the texture, click the target icon to show a magnifier, and move it around the texture. As you drag the magnifier over objects, it shows the value of each fragment. This should show you the object ID, but almost all of it shows an object ID of zero. The ground, which has an object ID of zero, renders on top of the other objects.
ID texture with erroneous object ID
To get the correct object ID, it's important to discard models' fragments that are behind other models. For this reason, you'll need to render with a depth texture.
Adding the Depth Attachment
➤ Open ObjectIdRenderPass.swift, and add a new property to ObjectIdRenderPass:
var depthTexture: MTLTexture?
So far, you’ve used the current drawable’s default depth texture. Next, you’ll create a depth texture that you’ll maintain.
➤ Add this code to resize(view:size:):
depthTexture = Self.makeTexture(
  size: size,
  pixelFormat: .depth32Float,
  label: "ID Depth Texture")
Here, you create the depth texture with the correct size and pixel format. This pixel format must match the render pipeline state depth texture format. ➤ Open Pipelines.swift, and in createObjectIdPSO(), change pipelineDescriptor.depthAttachmentPixelFormat = .invalid to: pipelineDescriptor.depthAttachmentPixelFormat = .depth32Float
Now the pixel formats will match.
➤ Go back to ObjectIdRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), after setting the color attachment texture, add:
descriptor.depthAttachment.texture = depthTexture
You created and stored a depth texture. If you were to build and run now and capture the GPU workload, you’d see the depth texture, but you haven’t completed setting up the GPU’s depth rendering yet.
The Depth Stencil State
➤ Create a new property in ObjectIdRenderPass:
var depthStencilState: MTLDepthStencilState?
➤ Add this code to the end of init(): depthStencilState = Self.buildDepthStencilState()
You set up a depth stencil state object with the usual depth rendering. ➤ In draw(commandBuffer:scene:uniforms:params:), add the following code after setting the render pipeline state: renderEncoder.setDepthStencilState(depthStencilState)
Here, you let the GPU know about the depth setting you want to render with. ➤ Build and run the app. Capture the GPU workload and take a look at your color texture now. Now when you run the magnifier over each object, you’ll clearly see the object IDs.
ID texture with Object IDs
You may see some random pixels at the top of the render. The render pass load action for the texture is dontCare, so wherever you're not rendering an object, the pixels will be random. You'll need to clear the texture before you render to know exactly what object ID is in the area you click to select.
Load and store actions
Note: In reality, it doesn’t matter whether you clear on load in this example. As you’ll see shortly, the change of color of each fragment on a picked object will only occur during the fragment function. Since the non-rendered pixels at the top of the screen aren’t being processed through a fragment function, a change of color will never happen. However, it’s good practice to know what’s happening in your textures. At some point, you might decide to pass back the texture to the CPU for further processing.
Load & Store Actions
You set up load and store actions in the render pass descriptor attachments. Only set the store action to store if you need the attachment texture down the line. Additionally, you should only clear the texture if you need to. If your fragment function writes to every fragment that appears on-screen, you generally don't need to clear. For example, you don't need to clear if you render a full-screen quad.
➤ Open ObjectIdRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), after setting descriptor.colorAttachments[0].texture, add:
descriptor.colorAttachments[0].loadAction = .clear
descriptor.colorAttachments[0].storeAction = .store
The load action can be clear, load or dontCare. The most common store actions are store or dontCare.
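For example, since nothing after this pass reads the depth texture, you could state its store action explicitly. This line is purely illustrative, not something the pass requires:
// Illustrative only: the depth texture isn't needed once this pass ends.
descriptor.depthAttachment.storeAction = .dontCare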
➤ Build and run the app, and capture the GPU workload again. Recheck the object ID texture.
No random pixels
The pixels at the top of the screen are now cleared with zeros. If you want a non-zero clear value, set colorAttachments[0].clearColor.
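For example, to clear the ID texture to some other value (the 99 here is purely illustrative, not something the project needs):
// For an r32Uint attachment, only the red component of the clear color is used.
descriptor.colorAttachments[0].clearColor =
  MTLClearColor(red: 99, green: 0, blue: 0, alpha: 0)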
Reading the Object ID Texture
You now have a choice. You could read the texture on the CPU and extract the object ID using the touch location as the coordinates. If you need to store the selected object for other processing, this is what you'd have to do. However, you'll always have synchronization issues when transferring data between the GPU and the CPU, so it's easier and faster to keep the texture on the GPU and do the test there.
➤ Open ForwardRenderPass.swift, and add this new property to ForwardRenderPass:
weak var idTexture: MTLTexture?
idTexture will hold the ID texture from the object ID render pass.
➤ Open Renderer.swift. In draw(scene:in:), add this code after objectIdRenderPass.draw(...): forwardRenderPass.idTexture = objectIdRenderPass.idTexture
You pass the ID texture from one render pass to the next. ➤ Open ForwardRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), before the for render loop, add: renderEncoder.setFragmentTexture(idTexture, index: 11)
You pass idTexture to the forward render pass's fragment function. Be careful with your index numbers. You may want to rename this one as you did with earlier indices. You'll also need to send the touch location to the fragment shader so you can use it to read the ID texture.
➤ After the previous code, add:
let input = InputController.shared
var params = params
params.touchX = UInt32(input.touchLocation?.x ?? 0)
params.touchY = UInt32(input.touchLocation?.y ?? 0)
input.touchLocation is the last location touched on the metal view. The SwiftUI gesture updates it on MetalView.
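This assumes Params declares matching members in Common.h. If your starter project doesn't already include them, they would look like this:
// In the Params struct in Common.h:
uint touchX;
uint touchY;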
➤ Open PBR.metal. This is the complex PBR shader you used in the previous chapter.
➤ Add this code to the parameters of fragment_PBR:
texture2d<uint> idTexture [[texture(11)]]
Be mindful of the type of texture you pass and the index number.
➤ After the conditional that sets material.baseColor, add:
if (!is_null_texture(idTexture)) {
  uint2 coord = uint2(params.touchX * 2, params.touchY * 2);
  uint objectID = idTexture.read(coord).r;
  if (params.objectId != 0 && objectID == params.objectId) {
    material.baseColor = float3(0.9, 0.5, 0);
  }
}
Here, you read idTexture using the passed-in touch coordinates. This code works because you ensured that idTexture is the same size as the screen. Sometimes it’s worthwhile to halve the texture size to save resources. You could certainly do that here, as long as you remember to halve the coordinates when reading the texture in the fragment function. Notice that read differs from sample. read uses pixel coordinates rather than normalized coordinates. You don’t need a sampler to read a texture, but you also can’t use the various sampler options when you use read. If the currently rendered object ID isn’t zero and the object ID matches the fragment in idTexture, change the material’s base color to orange. ➤ Build and run the app, and click any of the objects.
Selected train turns orange
The object you picked will turn orange. When you click the sky or the ground, none of the objects are picked. This is an easy way to test whether an object is picked. It's also a good way to learn simple render pass texture chaining. However, in most circumstances, you'll need to pass back the texture to the CPU, so it's more efficient to perform ray picking as described at the beginning of the chapter.
➤ With the app running, capture the GPU workload, and click the command buffer.
The completed render passes
The frame graph reflects that you're now using idTexture in your PBR fragment function. The GPU points out any missing textures and redundant binding errors where you bind an already bound texture. Now that you know how to render textures in different render passes, you can move on to more complex rendering and add some shadows in the next chapter.
Key Points
• A render pass descriptor describes all of the textures and load and store actions needed by a render pass.
• Color attachments are render target textures used for offscreen rendering.
• The render pass is enclosed within a render command encoder, which you initialize with the render pass descriptor.
• You set a pipeline state object on the render command encoder. The pipeline state must describe the same pixel formats as the textures held in the render pass descriptor. If there is no texture, the pixel format must be invalid.
• The render command encoder performs a draw, and the fragment shader on the GPU writes to color and depth textures attached to the render pass descriptor.
• Color attachments don't have to be rgb colors. Instead, you can write uint or float values in the fragment function.
• For each texture, you describe load and store actions. If you aren't using a texture in a later render pass, the action should be dontCare so the GPU can discard it and free up memory.
• The GPU workload capture shows you a frame graph where you can see how all your render passes chain together.
Chapter 13: Shadows
In this chapter, you’ll learn about shadows. A shadow represents the absence of light on a surface. You see shadows on an object when another surface or object obscures it from light. Adding shadows in a project makes your scene look more realistic and provides a feeling of depth.
Shadow Maps
Shadow maps are textures containing a scene's shadow information. When light shines on an object, it casts a shadow on anything behind it. Typically, you render the scene from your camera's location. However, to build a shadow map, you need to render your scene from the light source's location: in this case, the sun.
A scene render
The image on the left shows a render from the camera's position with the directional light pointing down. The image on the right shows a render from the directional light's position. The eye shows the camera's position in the first image. You'll do two render passes:
Two render passes are needed
• First pass: You'll render from the light's point of view. Since the sun is directional, you'll use an orthographic camera rather than a perspective camera. You're only interested in the depth of objects that the sun can see, so you won't render a color texture. In this pass, you'll only render the shadow map as a depth texture. This is a grayscale texture, with the gray value indicating depth. Black is close to the light, and white is farther away.
• Second pass: You'll render using the scene camera as usual, but you'll compare each camera fragment with the shadow map. If the camera fragment's depth is greater than the shadow map value at that position, the fragment is in shadow. The light can see the blue x in the above image, so it isn't in shadow.
Why would you need two passes here? In this case, you'll render the shadow map from the light's position, not from the camera's position. You'll save the output to a shadow texture and give it to the next render pass, which combines the shadow with the rest of the scene to make a final image.
The Starter Project
➤ In Xcode, open this chapter's starter project. The code in the starter project is almost identical to the previous chapter but without the object ID and picking code. The scene now has a visible sun that rotates around the scene's center as the only light. The code for rendering the sun is in the Utility group in DebugModel.swift. You'll render the sun separately from the scene so that the sun model itself isn't shaded with the rest of the scene. There are a few extra files containing code the app doesn't need yet. You'll learn about these files as you proceed through the chapter.
➤ Build and run the app.
The starter app
Without shadows, the train and trees in this render appear to float above the ground. The process for adding the new shadow pass is similar to adding the previous chapter's object picking render pass:
1. Create the new render pass structure and configure the render pass descriptor, depth stencil state and pipeline state.
2. Declare and draw the render pass in Renderer.
3. Set up the drawing code in the render pass.
4. Set up an orthographic camera from the light's position and calculate the necessary matrices.
5. Create the vertex shader function to draw vertices from the light's position.
Although you give the sun a position in this app, as you learned in Chapter 10, "Lighting Fundamentals", a directional light has a direction rather than a position. So here, you'll use the sun's position as a direction.
Note: If you want to see directional lines to debug the sun's direction, as you did in the earlier chapter, add DebugLights.draw(lights: scene.lighting.lights, encoder: renderEncoder, uniforms: uniforms) to ForwardRenderPass before renderEncoder.endEncoding().
Time to add the new shadow pass.
1. Creating the New Render Pass
➤ In the Render Passes group, create a new Swift file named ShadowRenderPass.swift, and replace the code with:
import MetalKit

struct ShadowRenderPass: RenderPass {
  let label: String = "Shadow Render Pass"
  var descriptor: MTLRenderPassDescriptor? = MTLRenderPassDescriptor()
  var depthStencilState: MTLDepthStencilState? =
    Self.buildDepthStencilState()
  var pipelineState: MTLRenderPipelineState
  var shadowTexture: MTLTexture?

  mutating func resize(view: MTKView, size: CGSize) { }

  func draw(
    commandBuffer: MTLCommandBuffer,
    scene: GameScene,
    uniforms: Uniforms,
    params: Params
  ) { }
}
This code creates a render pass that conforms to RenderPass, with a pipeline state and a texture property for the shadow map.
➤ Open Pipelines.swift, and create a new method to create the pipeline state object:
static func createShadowPSO() -> MTLRenderPipelineState {
  let vertexFunction =
    Renderer.library?.makeFunction(name: "vertex_depth")
  let pipelineDescriptor = MTLRenderPipelineDescriptor()
  pipelineDescriptor.vertexFunction = vertexFunction
  pipelineDescriptor.colorAttachments[0].pixelFormat = .invalid
  pipelineDescriptor.depthAttachmentPixelFormat = .depth32Float
  pipelineDescriptor.vertexDescriptor = .defaultLayout
  return createPSO(descriptor: pipelineDescriptor)
}
Here, you create a pipeline state without a color attachment or fragment function. You're only interested in the depth information for the shadow, not the color information, so you set the color attachment pixel format to invalid. You'll still render models, so you still need to hold the vertex descriptor and transform all of the models' vertices in a vertex function.
➤ Open ShadowRenderPass.swift, and create the initializer:
init() {
  pipelineState = PipelineStates.createShadowPSO()
  shadowTexture = Self.makeTexture(
    size: CGSize(
      width: 2048,
      height: 2048),
    pixelFormat: .depth32Float,
    label: "Shadow Depth Texture")
}
This code initializes the pipeline state object and builds the depth texture with the required pixel format. Unlike other render passes, where you match the view's size, shadow maps are usually square to match the light's cuboid orthographic camera, so you don't need to resize the texture when the window resizes. Make the resolution as high as your game's resource budget allows: higher resolutions produce sharper shadows.
2. Declaring and Drawing the Render Pass
➤ In the Game group, open Renderer.swift, and add the new render pass property to Renderer:
var shadowRenderPass: ShadowRenderPass
➤ In init(metalView:options:), initialize the render pass before calling super.init():
shadowRenderPass = ShadowRenderPass()
➤ In mtkView(_:drawableSizeWillChange:), add: shadowRenderPass.resize(view: view, size: size)
At the moment, you’re not resizing any textures in shadowRenderPass, but since you’ve already created resize(view:size:) as a required method to conform to RenderPass, and you may add textures later, you should call the method here.
➤ In draw(scene:in:), after updateUniforms(scene: scene), add: shadowRenderPass.draw( commandBuffer: commandBuffer, scene: scene, uniforms: uniforms, params: params)
This code performs the render pass.
3. Setting up the Render Pass Drawing Code
➤ Open ShadowRenderPass.swift, and add the following code to draw(commandBuffer:scene:uniforms:params:):
guard let descriptor = descriptor else { return }
descriptor.depthAttachment.texture = shadowTexture
descriptor.depthAttachment.loadAction = .clear
descriptor.depthAttachment.storeAction = .store
guard let renderEncoder =
  commandBuffer.makeRenderCommandEncoder(descriptor: descriptor)
  else { return }
renderEncoder.label = "Shadow Encoder"
renderEncoder.setDepthStencilState(depthStencilState)
renderEncoder.setRenderPipelineState(pipelineState)
for model in scene.models {
  renderEncoder.pushDebugGroup(model.name)
  model.render(
    encoder: renderEncoder,
    uniforms: uniforms,
    params: params)
  renderEncoder.popDebugGroup()
}
renderEncoder.endEncoding()
Here, you set the depth attachment texture on the descriptor’s depth attachment. The GPU will clear the texture on loading it and store it so that the following render pass can use it. You then create the render command encoder, using the descriptor, and render the scene as usual. You surround the model render with renderEncoder.pushDebugGroup(_:) and renderEncoder.popDebugGroup(), which gathers the render commands into groups on the GPU workload capture. Now, you can more easily debug what’s happening.
4. Setting up the Light Camera
During the shadow pass, you'll render from the point of view of the sun, so you'll need a new camera and some new shader matrices.
➤ In Common.h, in the Shaders group, add these properties to Uniforms:
matrix_float4x4 shadowProjectionMatrix;
matrix_float4x4 shadowViewMatrix;
Here, you hold the projection and view matrices for the sunlight. ➤ Open Renderer.swift, and add a new property to Renderer: var shadowCamera = OrthographicCamera()
Here, you create an orthographic camera. You previously used an orthographic camera in Chapter 9, "Navigating a 3D Scene", to render the scene from above. Because sunlight is a directional light, this is the correct projection type for shadows caused by sunlight. For example, if you want shadows from a spotlight, you would use a perspective camera with a field of view that matches the spotlight's cone angle.
➤ Add the following code to the end of updateUniforms(scene:):
shadowCamera.viewSize = 16
shadowCamera.far = 16
let sun = scene.lighting.lights[0]
shadowCamera.position = sun.position
With this code, you set up the orthographic camera with a cubic view volume of 16 units.
➤ Continue with some more code:
uniforms.shadowProjectionMatrix = shadowCamera.projectionMatrix
uniforms.shadowViewMatrix = float4x4(
  eye: sun.position,
  center: .zero,
  up: [0, 1, 0])
shadowViewMatrix is a lookAt matrix that ensures the sun is looking at the center of the scene. float4x4(eye:center:up) is defined in MathLibrary.swift. It takes the camera's position, the point that the camera should look at, and the camera's up vector, and returns a matrix that rotates the camera to look at the target.
Note: Here's a useful debugging tip. Temporarily set uniforms.viewMatrix to uniforms.shadowViewMatrix and uniforms.projectionMatrix to uniforms.shadowProjectionMatrix at the end of updateUniforms(scene:). Developers commonly get the shadow matrices wrong, and it's useful to visualize the scene render through the light.
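As a concrete sketch of that tip, the temporary override is just two assignments at the end of updateUniforms(scene:); remove them once the shadow map looks right:
// Debug only: render the scene from the light's point of view.
uniforms.viewMatrix = uniforms.shadowViewMatrix
uniforms.projectionMatrix = uniforms.shadowProjectionMatrix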
5. Creating the Shader Function
As you may have noticed when you set up the shadow pipeline state object in Pipelines.swift, it references a shader function named vertex_depth, which doesn't exist yet.
➤ In the Shaders group, using the Metal File template, create a new file named Shadow.metal. Make sure to check both macOS and iOS targets.
➤ Add the following code to the new file:
#import "Common.h"

struct VertexIn {
  float4 position [[attribute(0)]];
};

vertex float4 vertex_depth(
  const VertexIn in [[stage_in]],
  constant Uniforms &uniforms [[buffer(UniformsBuffer)]])
{
  matrix_float4x4 mvp =
    uniforms.shadowProjectionMatrix * uniforms.shadowViewMatrix
      * uniforms.modelMatrix;
  return mvp * in.position;
}
This code receives a vertex position, transforms it by the light’s projection and view matrices that you set up in Renderer and returns the transformed position. ➤ Build and run the app.
No shadow yet
That looks nice, but where's the shadow?
➤ Capture the GPU workload and examine the frame capture.
GPU frame capture
➤ Double-click the shadow encoder pass texture result to show the texture in the other resource pane.
The shadow pass depth texture
This is the scene rendered from the light's position. You used the shadow pipeline state, which you configured not to have a fragment shader, so the color information isn't processed here at all: it's purely depth. Lighter colors are farther away, and darker colors are closer.
The Main Pass
Now that you have the shadow map saved to a texture, you just need to send it to the main pass to use the texture in lighting calculations in the fragment function.
➤ Open ForwardRenderPass.swift, and add a new property:
weak var shadowTexture: MTLTexture?
➤ In draw(commandBuffer:scene:uniforms:params:), before the model render for loop, add: renderEncoder.setFragmentTexture(shadowTexture, index: 15)
You pass in the shadow texture and send it to the GPU. ➤ Open Renderer.swift, and add this code to draw(scene:in:) before drawing the forward render pass: forwardRenderPass.shadowTexture = shadowRenderPass.shadowTexture
You pass the shadow texture from the previous shadow pass to the forward render pass. ➤ In the Shaders group, open Vertex.h, and add a new member to VertexOut: float4 shadowPosition;
This holds the vertex position transformed by the shadow matrices. ➤ Open Shaders.metal, and add this line in vertex_main, when creating out: .shadowPosition = uniforms.shadowProjectionMatrix * uniforms.shadowViewMatrix * uniforms.modelMatrix * in.position
You hold two transformed positions for each vertex: one transformed within the scene from the camera's point of view, and the other from the light's point of view. You'll be able to compare the shadow position with the fragment from the shadow map.
➤ Open PBR.metal. This file is where the lighting happens, so the rest of the shadow work will occur in fragment_PBR.
➤ First, add one more function parameter after aoTexture:
depth2d<float> shadowTexture [[texture(15)]]
Unlike the textures you've used in the past, which have a type of texture2d, the texture type of a depth texture is depth2d.
➤ At the end of fragment_PBR, before return, add:
// shadow calculation
// 1
float3 shadowPosition = in.shadowPosition.xyz / in.shadowPosition.w;
// 2
float2 xy = shadowPosition.xy;
xy = xy * 0.5 + 0.5;
xy.y = 1 - xy.y;
xy = saturate(xy);
// 3
constexpr sampler s(
  coord::normalized, filter::linear,
  address::clamp_to_edge,
  compare_func::less);
float shadow_sample = shadowTexture.sample(s, xy);
// 4
if (shadowPosition.z > shadow_sample) {
  diffuseColor *= 0.5;
}
Here's a code breakdown:
1. in.shadowPosition represents the vertex's position from the light's point of view. The GPU performed a perspective divide before writing the fragment to the shadow texture when you rendered from the light's point of view. Dividing xyz by w here matches the same perspective division so that you can compare the current sample's depth value to the one in the shadow texture.
2. Determine a coordinate pair from the shadow position to serve as a screen space pixel locator on the shadow texture. Then, you rescale the coordinates from [-1, 1] to [0, 1] to match the uv space. Finally, you reverse the Y coordinate since it's upside down.
3. Create a sampler to use with the shadow texture, and sample the texture at the coordinates you just created to get the depth value for the currently processed pixel. You create a new sampler because textureSampler, initialized at the top of the function, repeats the texture if it's sampled off the edge. Try using textureSampler later to see repeated extra shadows at the back of the scene.
4. You darken the diffuse color for pixels with a depth greater than the shadow value stored in the texture. For example, if shadowPosition.z is 0.5, and shadow_sample from the stored depth texture is 0.2, then from the sun's point of view, the current fragment is further away than the stored fragment. Since the sun can't see the fragment, it's in shadow.
Shadows added
Shadow Acne
In the previous image, as the sun rotates, you'll notice a lot of flickering. This is called shadow acne or surface acne. The surface is self-shadowing because of a lack of float precision where the sampled texel doesn't match the calculated value. You can mitigate this by adding a bias to the shadow texture, increasing the z value, thereby bringing the stored fragment closer.
➤ Change the conditional test at // 4 above to:
if (shadowPosition.z > shadow_sample + 0.001) {
➤ Build and run the app.
Shadows with no acne
The surface acne is now gone, and you have clear shadows from the sun as it rotates around the scene.
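A variation you could experiment with, though it isn't what this chapter does: because the sampler already declares compare_func::less, you can let the hardware perform the depth comparison with sample_compare. With linear filtering this returns a 0 to 1 visibility value, which softens the shadow edge slightly. The 0.001 bias and 0.5 darkening factor match the values above.
// Alternative sketch: hardware depth comparison instead of a manual if.
float visibility =
    shadowTexture.sample_compare(s, xy, shadowPosition.z - 0.001);
diffuseColor *= mix(0.5, 1.0, visibility);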
Identifying Problems
Take a look at the previous render, and you'll see a problem. Actually, there are two problems. A large dark gray area on the plane appears to be in shadow but shouldn't be.
If you capture the scene, this is the depth texture for that sun position:
Orthographic camera too large
The bottom quarter of the image is white, meaning that the depth for that position is at its farthest. The light's orthographic camera cuts off that part of the plane and causes it to look as if it is in shadow. The second problem is reading the shadow map.
➤ Open PBR.metal. In fragment_PBR, before xy = saturate(xy);, add:
if (xy.x < 0.0 || xy.x > 1.0 ||
    xy.y < 0.0 || xy.y > 1.0) {
  return float4(1, 0, 0, 1);
}
The xy texture coordinates should go from 0 to 1 to be on the texture. So if the coordinates are off the texture, you return red. ➤ Build and run the app.
Reading values off the texture
Areas in red are off the depth texture.
You can solve these two problems by setting up the light’s orthographic camera to enclose everything the scene camera catches.
Visualizing the Problems
In the Utility group, DebugCameraFrustum.swift will help you visualize this problem by rendering wireframes for the various camera frustums. When running the app, you can press various keys for debugging purposes:
• 1: The front view of the scene.
• 2: The default view where the sun rotates around the scene.
• 3: Render a wireframe of the scene camera frustum.
• 4: Render a wireframe of the light camera frustum.
• 5: Render a wireframe of the scene camera's bounding sphere.
This key code is in GameScene's update(deltaTime:).
➤ Open ForwardRenderPass.swift, and add this to the end of draw(commandBuffer:scene:uniforms:params:), before renderEncoder.endEncoding():
DebugCameraFrustum.draw(
  encoder: renderEncoder,
  scene: scene,
  uniforms: uniforms)
This code sets up the debug code so that the above keypresses will work. ➤ Open GameScene.swift, and at the top of init(), add: camera.far = 5
The default for the camera’s far plane is 100, and it’s difficult to visualize. A far of 5 is quite close and easy to visualize but will temporarily cut off a lot of the scene.
➤ Build and run the app.
Some of the scene is missing.
You can see that much of the scene is missing due to the closer far plane.
➤ Press the number 3 key on the keyboard above the letters. Here, you pause the sun's rotation and create a new perspective arcball camera that looks down on the scene from afar and renders the original perspective scene camera frustum in blue wireframe. Using the mouse or trackpad, drag the scene to rotate and examine it. You'll see that the third tree lies outside the blue frustum and therefore isn't rendered. You'll also see where the shadow texture covers the scene. Red areas lie outside the shadow texture.
The scene camera frustum
➤ Press the number 4 key. The orthographic light’s camera view volume wireframe shows in yellow.
The light view volume
One edge of the light's view volume cuts off the corners of the gray plane. That's where the light's frustum doesn't reach, and it shows up on the shadow map texture as white.
➤ In GameScene, change camera.far = 5 to:
camera.far = 10
➤ Build and run the app. When you see a patch of red plane, press the 3 key then the 4 key.
Understanding why the scene captures area off texture
Rotating the scene, you'll see that the blue wireframe extends into the red area. The yellow wireframe should enclose that area but currently doesn't.
➤ Press the number 5 key. This shows a white bounding sphere that encloses the scene camera’s frustum.
The scene camera frustum's bounding sphere
The light volume should enclose the white bounding sphere to get the best shadows.
Solving the Problems
➤ In the Game group, open ShadowCamera.swift. This file contains various methods to calculate the corners of the camera frustum. createShadowCamera(using:lightPosition:) creates an orthographic camera that encloses the specified camera.
➤ Open Renderer.swift. In updateUniforms(scene:), replace all of the shadow code from shadowCamera.viewSize = 16 to the end of the method with:
let sun = scene.lighting.lights[0]
shadowCamera = OrthographicCamera.createShadowCamera(
  using: scene.camera,
  lightPosition: sun.position)
uniforms.shadowProjectionMatrix = shadowCamera.projectionMatrix
uniforms.shadowViewMatrix = float4x4(
  eye: shadowCamera.position,
  center: shadowCamera.center,
  up: [0, 1, 0])
Here, you create an orthographic light camera with a view volume that completely envelops the scene.camera's frustum.
➤ Build and run the app.
Light view volume encloses scene camera frustum
Now the light camera volume encloses the whole scene so that you won't see any red errors or erroneous gray patches.
➤ Open GameScene.swift. In init(), remove:
camera.far = 10
You remove the assignment to camera.far, which restores it to its default of 100.
➤ Build and run the app. Due to the huge light view volume, the shadows are very blocky. The image below shows the rendered shadow texture on the right. You can see almost no detail.
Blocky shadows when the light volume is too large
You can change far back to 5 or 20 and capture the GPU workload to compare shadow texture quality. This is one situation where you, as the game designer, would have to decide on the shadow quality. The best outcome would be to use the far value of 5 on shadows closer to the camera and a far value of 20 for shadows farther away, as the resolution won’t matter so much.
Cascaded Shadow Mapping
Modern games use a technique known as cascaded shadow maps to help balance performance and shadow depth. In Chapter 8, "Textures", you learned about mip maps, textures of varying sizes used by the GPU depending on the distance from the camera. Cascaded shadow maps employ a similar idea. With cascaded shadow maps, you render the scene to several shadow maps in a depth texture array using different near and far planes. As you've seen, the smaller far value creates a smaller light volume, which produces a more detailed shadow map. You sample the shadow from the shadow map with the smaller light volume for the fragments closer to the scene camera. Farther away, you don't need as much accuracy, so you can sample the shadow from the larger light frustum that takes in more of the scene. The downside is that you have to render the scene multiple times, once for each shadow map. Shadows can take a lot of calculation and processing time. You have to decide how much of your frame time allowance to give them. In the resources folder for this chapter, references.markdown contains some articles about common techniques to improve your shadows.
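This chapter doesn't implement cascades, but the cascade-selection step in the fragment shader is only a few lines. The sketch below is not the book's code; it assumes hypothetical additions such as a depth2d_array shadow texture, per-cascade view-projection matrices and split distances in Uniforms, and a view-space depth for the current fragment.
// Sketch only: pick the tightest cascade whose split distance covers this fragment.
const int cascadeCount = 3;
int cascade = cascadeCount - 1;
for (int i = 0; i < cascadeCount; i++) {
  if (viewSpaceDepth < uniforms.cascadeSplits[i]) {
    cascade = i;
    break;
  }
}
// Project into that cascade's shadow map and sample the matching array slice.
float4 shadowPosition =
  uniforms.cascadeViewProjectionMatrices[cascade] * float4(in.worldPosition, 1);
float2 xy = shadowPosition.xy / shadowPosition.w * 0.5 + 0.5;
xy.y = 1 - xy.y;
float shadowSample = shadowTextureArray.sample(s, xy, uint(cascade));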
Key Points
• A shadow map is a render taken from the point of view of the light casting the shadow.
• You capture a depth map from the perspective of the light in a first render pass.
• A second render pass then compares the depth of the rendered fragment with the stored depth map fragment. If the fragment is in shadow, you shade the diffuse color accordingly.
• The best shadows are where the light view volume exactly encases the scene camera's frustum. However, you have to know how much of the scene is being captured. If the area is large, shadows will be blocky.
• Shadows are expensive. A lot of research has gone into rendering shadows, and there are many different techniques and improvements. Cascaded shadow mapping is the most common modern technique.
Chapter 14: Deferred Rendering
Up to now, your lighting model has used a simple technique called forward rendering. With traditional forward rendering, you draw each model in turn. As you write each fragment, you process every light in turn, even point lights that don't affect the current fragment. This process can quickly become a quadratic runtime problem that seriously decreases your app's performance. Assume you have a hundred models and a hundred lights in the scene; in a metropolitan downtown, the buildings and street lights could quickly add up to that many objects. At this point, you'd be looking for an alternative rendering technique. Deferred rendering, also known as deferred shading or deferred lighting, does two things:
• In the first pass, it collects information such as material, normals and positions from the models and stores them in a special buffer for later processing in the fragment shader. Unnecessary calculations don't occur in this first pass. The special buffer is named the G-buffer, where G is for Geometry.
• In the second pass, it processes all lights in a fragment shader, but only where the light affects the fragment.
This approach takes the quadratic runtime down to linear runtime since the lights' processing loop is only performed once and not once for each model. Look at the forward rendering algorithm:
// single pass
for each model {
  for each fragment {
    for each light {
      if directional { accumulate lighting }
      if point { accumulate lighting }
      if spot { accumulate lighting }
    }
  }
}
You implemented this algorithm in Chapter 10, "Lighting Fundamentals".
Point lights affecting fragments
In forward rendering, you process both lights for the magnified fragments in the image above even though the blue light on the right won't affect them. Now, compare it to the deferred rendering algorithm:
// pass 1 - g-buffer capture
for each model {
  for each fragment {
    capture color, position, normal and shadow
  }
}

// pass 2 - light accumulation
render a quad
for each fragment {
  accumulate directional light
}
render geometry for point light volumes
for each fragment {
  accumulate point light
}
render geometry for spot light volumes
for each fragment {
  accumulate spot light
}
Four textures comprise the G-buffer
While you have more render passes with deferred rendering, you process fewer lights. All fragments process the directional light, which shades the albedo along with adding the shadow from the directional light. But for the point light, you render special geometry that only covers the area the point light affects. The GPU will process only the affected fragments. Here are the steps you'll take throughout this chapter:
• The first pass renders the shadow map. You've already done this.
• The second pass constructs G-buffer textures containing these values: material color (or albedo) with shadow information, world space normals and positions.
• Using a full-screen quad, the third and final pass processes the directional light. The same pass then renders point light volumes and accumulates point light information. If you have spotlights, you would repeat this process.
Note: Apple GPUs can combine the second and third passes. Chapter 15, "Tile-Based Deferred Rendering", will revise this chapter's project to take advantage of this feature.
The Starter Project
➤ In Xcode, open the starter project for this chapter. The project is almost the same as the end of the previous chapter, with some refactoring and reorganization. There's new lighting, with extra point lights. The camera and light debugging features from the previous chapter are gone. Take note of the following additions:
• In the Game group, in SceneLighting.swift, createPointLights(count:min:max:) creates multiple point lights.
• Since you'll deal with many lights, the light buffer is greater than 4k. This means that you won't be able to use setFragmentBytes(_:length:index:). Instead, scene lighting is now split out into three light buffers: one for sunlight, one for point lights and one that contains both sun and point lights, so that forward rendering still works as it did before. Spotlighting isn't implemented here. (See the sketch after this list for what building such a buffer involves.)
• In the Render Passes group, GBufferRenderPass.swift is a copy of ForwardRenderPass.swift and is already set up in Renderer. You'll work on this render pass and change it to suit deferred rendering.
• In the app, a radio button below the metal view gives you the option to switch between render pass types. There won't be any difference in the render at this point.
• For simplicity, the renderer returns to Phong shading rather than processing textures for PBR.
• In the Shaders group, in Lighting.metal, phongLighting's conditional code is refactored into separate functions, one for each lighting method.
• icosphere.obj is a new model you'll use later in the chapter.
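As referenced in the light-buffer bullet above, a buffer of lights is just the Light array copied into GPU memory. Here's a minimal sketch, assuming a helper on SceneLighting; the starter project's actual code may be organized differently.
// Sketch: copy an array of Light structs into an MTLBuffer.
static func createBuffer(lights: [Light]) -> MTLBuffer {
  var lights = lights
  guard let buffer = Renderer.device.makeBuffer(
    bytes: &lights,
    length: MemoryLayout<Light>.stride * lights.count,
    options: []) else {
    fatalError("Failed to create lights buffer")
  }
  return buffer
}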
➤ Build and run the app, and ensure that you know how all of the code fits together.
The starter app
The twenty point lights are random, so your render may look slightly different.
Note: To visualize where the point lights are, uncomment the DebugLights draw at the end of ForwardRenderPass.swift. You'll see the point light positions when you choose the Forward option in the app.
The G-buffer Pass
All right, time to build up that G-buffer!
➤ In the Render Passes group, open GBufferRenderPass.swift, and add four new texture properties to GBufferRenderPass:
var albedoTexture: MTLTexture?
var normalTexture: MTLTexture?
var positionTexture: MTLTexture?
var depthTexture: MTLTexture?
These are the textures the G-buffer requires.
➤ Add this to resize(view:size:):
albedoTexture = Self.makeTexture(
  size: size,
  pixelFormat: .bgra8Unorm,
  label: "Albedo Texture")
normalTexture = Self.makeTexture(
  size: size,
  pixelFormat: .rgba16Float,
  label: "Normal Texture")
positionTexture = Self.makeTexture(
  size: size,
  pixelFormat: .rgba16Float,
  label: "Position Texture")
depthTexture = Self.makeTexture(
  size: size,
  pixelFormat: .depth32Float,
  label: "Depth Texture")
Here, you create the four textures with the desired pixel formats. bgra8Unorm has the format of four 8-bit unsigned components, which store integer values between 0 and 255. However, you'll need to store the position and normal values in higher precision than the color values by using rgba16Float.
➤ In the Shaders group, open Common.h, and add a new enumeration for the extra texture indices:
typedef enum {
  RenderTargetAlbedo = 1,
  RenderTargetNormal = 2,
  RenderTargetPosition = 3
} RenderTargetIndices;
These are the names for the render target texture indices.
➤ In the Geometry group, open VertexDescriptor.swift, and add the syntactic sugar extension:
extension RenderTargetIndices {
  var index: Int {
    return Int(rawValue)
  }
}
Using values from RenderTargetIndices will now be easier to read.
➤ Open Pipelines.swift. You'll create all the pipeline states here. You can compare what each pipeline state requires as you progress through the chapter. Currently, createGBufferPSO(colorPixelFormat:) and createForwardPSO(colorPixelFormat:) are the same, but you'll need to change the G-buffer pipeline state object to specify the different texture formats.
➤ At the end of the file, create a new extension:
extension MTLRenderPipelineDescriptor {
  func setGBufferPixelFormats() {
    colorAttachments[RenderTargetAlbedo.index]
      .pixelFormat = .bgra8Unorm
    colorAttachments[RenderTargetNormal.index]
      .pixelFormat = .rgba16Float
    colorAttachments[RenderTargetPosition.index]
      .pixelFormat = .rgba16Float
  }
}
These color attachment pixel formats are the same as the ones you used for the albedo, normal and position textures. ➤ In createGBufferPSO(colorPixelFormat:), replace: pipelineDescriptor.colorAttachments[0].pixelFormat = colorPixelFormat
➤ With: pipelineDescriptor.colorAttachments[0].pixelFormat = .invalid pipelineDescriptor.setGBufferPixelFormats()
This sets the three color attachment pixel formats. Notice that you're not using colorAttachments[0] any more, as RenderTargetIndices starts at 1. You could use 0 for the albedo since you're not using the drawable in this pass, but you'll combine passes in the next chapter, so you leave color attachment 0 free for that. For the vertex function, you can reuse vertex_main from the main render pass, as all this does is transform the positions and normals. However, you'll need a new fragment function that stores the position and normal data into textures and doesn't process the lighting.
➤ Still in createGBufferPSO(colorPixelFormat:), replace "fragment_main" with:

"fragment_gBuffer"
That completes the pipeline state object setup. Next, you'll deal with the render pass descriptor.

➤ Open GBufferRenderPass.swift, and add this code to the bottom of init(view:):

descriptor = MTLRenderPassDescriptor()
Here, you create a new render pass descriptor instead of using the view's automatically-generated render pass descriptor.

➤ At the top of draw(commandBuffer:scene:uniforms:params:), add:

let textures = [
  albedoTexture,
  normalTexture,
  positionTexture
]
for (index, texture) in textures.enumerated() {
  let attachment =
    descriptor?.colorAttachments[RenderTargetAlbedo.index + index]
  attachment?.texture = texture
  attachment?.loadAction = .clear
  attachment?.storeAction = .store
  attachment?.clearColor =
    MTLClearColor(red: 0.73, green: 0.92, blue: 1, alpha: 1)
}
descriptor?.depthAttachment.texture = depthTexture
descriptor?.depthAttachment.storeAction = .dontCare
You iterate through each of the three textures that you'll write in the fragment function and add them to the render pass descriptor's color attachments. Because each color attachment's load action is clear, you can set the clear color. The scene depicts a sunny day with sharp shadows, so you set the color to sky blue. The store action ensures that the color textures don't clear before the next render pass. However, you won't need the depth attachment after this render pass, so you set its store action to dontCare.

➤ Still in draw(commandBuffer:scene:uniforms:params:), remove:

renderEncoder.setFragmentBuffer(
  scene.lighting.lightsBuffer,
  offset: 0,
  index: LightBuffer.index)
In the initial pass, you only store the albedo, or base color, and mark fragments as shadowed or not. You don't need the light buffer because you previously processed the shadow matrices in the shadow render pass.

Currently, you send the view's render pass descriptor to GBufferRenderPass. However, you must change this since you created a new one.

➤ Open Renderer.swift. In draw(scene:in:), remove:

gBufferRenderPass.descriptor = descriptor
Before you test all of this code, you must create the new fragment shader.

➤ In the Shaders group, create a new Metal File named Deferred.metal. Add this code to the new file:

#import "Vertex.h"
#import "Lighting.h"

fragment float4 fragment_gBuffer(
  VertexOut in [[stage_in]],
  depth2d<float> shadowTexture [[texture(ShadowTexture)]],
  constant Material &material [[buffer(MaterialBuffer)]])
{
  return float4(material.baseColor, 1);
}
Here, you take in the results of the vertex function, the shadow texture from the shadow render pass, and the object’s material. You return the base color of the material so that you’ll be able to see something in the render. ➤ Build and run the app.
The current drawable contains randomness
Currently, you aren’t writing anything to the view’s drawable, only to the G-buffer render pass descriptor textures. So you’ll get something random on your app window. Mine comes out a lovely shade of magenta. ➤ Capture the GPU workload, and click the Command Buffer to see what’s happening there. You’ll probably get an error stating that it can’t harvest a resource, but ignore that for the moment.
Frame capture with G-buffer textures

From this result, you can see that you successfully stored the shadow texture from the shadow pass as well as the three color and depth textures from your G-buffer pass, cleared to your sky blue color.

➤ Open Deferred.metal, and add a new structure before fragment_gBuffer:

struct GBufferOut {
  float4 albedo [[color(RenderTargetAlbedo)]];
  float4 normal [[color(RenderTargetNormal)]];
  float4 position [[color(RenderTargetPosition)]];
};
These correspond to the pipeline state and render pass descriptor color attachment textures.

➤ Replace fragment_gBuffer() with:

// 1
fragment GBufferOut fragment_gBuffer(
  VertexOut in [[stage_in]],
  depth2d<float> shadowTexture [[texture(ShadowTexture)]],
  constant Material &material [[buffer(MaterialBuffer)]])
{
  GBufferOut out;
  // 2
  out.albedo = float4(material.baseColor, 1.0);
  // 3
  out.albedo.a = calculateShadow(in.shadowPosition, shadowTexture);
  // 4
  out.normal = float4(normalize(in.worldNormal), 1.0);
  out.position = float4(in.worldPosition, 1.0);
  return out;
}
Here, you:

1. Return GBufferOut from the fragment function instead of only a single color value.
2. Set the albedo texture to the material's base color.
3. Calculate whether the fragment is in shadow, using the shadow position and the shadow texture. The shadow value is a single float. Since you don't use the alpha channel of the albedo texture, you can store the shadow value there.
4. Write the normal and position values into the corresponding textures.

➤ Build and run the app, and capture the GPU workload.
G-buffer textures containing data

fragment_gBuffer now writes to your three color textures.
The Lighting Pass

Up to this point, you rendered the scene to multiple render targets, saving them for later use in the fragment shader. By rendering a full-screen quad, you can cover every pixel on the screen. This lets you process each fragment from your three textures and calculate lighting for each fragment. The results of this composition pass will end up in the view's drawable.

➤ Create a new Swift file named LightingRenderPass in the Render Passes group. Replace the contents with:

import MetalKit

struct LightingRenderPass: RenderPass {
  let label = "Lighting Render Pass"
  var descriptor: MTLRenderPassDescriptor?
  var sunLightPSO: MTLRenderPipelineState
  let depthStencilState: MTLDepthStencilState?
  weak var albedoTexture: MTLTexture?
  weak var normalTexture: MTLTexture?
  weak var positionTexture: MTLTexture?

  func resize(view: MTKView, size: CGSize) {}

  func draw(
    commandBuffer: MTLCommandBuffer,
    scene: GameScene,
    uniforms: Uniforms,
    params: Params
  ) {
  }
}
With this code, you add the necessary conformance to RenderPass and the texture properties you need for this consolidation pass. You'll accumulate output from all light types when running the lighting pass. Each type of light needs a different fragment function, so you'll need multiple pipeline states. First, you'll create a pipeline state object for rendering the sun's directional light, and you'll return later to add a point light pipeline state object.

➤ Open Pipelines.swift, and copy createForwardPSO(colorPixelFormat:) to a new method named createSunLightPSO(colorPixelFormat:).

Instead of rendering models, you'll render a quad for the lighting pass. You can define the vertices on the GPU and create a simple vertex function.
➤ In createSunLightPSO(colorPixelFormat:), replace "vertex_main" with:

"vertex_quad"
This vertex function is responsible for positioning the quad vertices.

➤ Replace "fragment_main" with:

"fragment_deferredSun"
➤ Remove:

pipelineDescriptor.vertexDescriptor = MTLVertexDescriptor.defaultLayout
For this quad, you don't need the vertex descriptor. If you don't remove it, the GPU expects many buffers at the various bindings as described by the default vertex descriptor in VertexDescriptor.swift.

➤ Open LightingRenderPass.swift, and add the initializer to LightingRenderPass:

init(view: MTKView) {
  sunLightPSO = PipelineStates.createSunLightPSO(
    colorPixelFormat: view.colorPixelFormat)
  depthStencilState = Self.buildDepthStencilState()
}
Here, you initialize the pipeline state object with your new pipeline state parameters.

➤ In draw(commandBuffer:scene:uniforms:params:), add:

guard let descriptor = descriptor,
  let renderEncoder = commandBuffer.makeRenderCommandEncoder(
    descriptor: descriptor) else { return }
renderEncoder.label = label
renderEncoder.setDepthStencilState(depthStencilState)
var uniforms = uniforms
renderEncoder.setVertexBytes(
  &uniforms,
  length: MemoryLayout<Uniforms>.stride,
  index: UniformsBuffer.index)
Since you'll draw the quad directly to the screen, you'll use the view's current render pass descriptor. You set up the render command encoder as usual with the depth stencil state and vertex uniforms.

➤ After the previous code, add:

renderEncoder.setFragmentTexture(
  albedoTexture,
  index: BaseColor.index)
renderEncoder.setFragmentTexture(
  normalTexture,
  index: NormalTexture.index)
renderEncoder.setFragmentTexture(
  positionTexture,
  index: NormalTexture.index + 1)
This code passes the three attachments from the G-buffer render pass. Note the laziness of adding one to the index for the position. This is a mistake waiting to happen, and you should give that index its own name at some point (there's a short sketch of one option after the next paragraph).

➤ Create a new method for processing the sun light:

func drawSunLight(
  renderEncoder: MTLRenderCommandEncoder,
  scene: GameScene,
  params: Params
) {
  renderEncoder.pushDebugGroup("Sun Light")
  renderEncoder.setRenderPipelineState(sunLightPSO)
  var params = params
  params.lightCount = UInt32(scene.lighting.sunlights.count)
  renderEncoder.setFragmentBytes(
    &params,
    length: MemoryLayout<Params>.stride,
    index: ParamsBuffer.index)
  renderEncoder.setFragmentBuffer(
    scene.lighting.sunBuffer,
    offset: 0,
    index: LightBuffer.index)
  renderEncoder.drawPrimitives(
    type: .triangle,
    vertexStart: 0,
    vertexCount: 6)
  renderEncoder.popDebugGroup()
}
Here, you send the sun light details to the fragment function and draw the six vertices of a quad.
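As a minimal sketch of naming that position texture index (the constant name here is hypothetical and isn't part of the project's code), you could define it once and reuse it wherever the position texture is bound:

// Hypothetical constant: defining the position texture index once avoids
// repeating `+ 1` at every call site.
let positionTextureIndex = NormalTexture.index + 1

renderEncoder.setFragmentTexture(
  positionTexture,
  index: positionTextureIndex)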
➤ Call this new method at the end of draw(commandBuffer:scene:uniforms:params:):

drawSunLight(
  renderEncoder: renderEncoder,
  scene: scene,
  params: params)
renderEncoder.endEncoding()
You call the method and end the render pass encoding.
Updating Renderer

You'll now add the new lighting pass to Renderer and pass in the necessary textures and render pass descriptor.

➤ Open Renderer.swift, and add a new property to Renderer:

var lightingRenderPass: LightingRenderPass
➤ Add this line before super.init():

lightingRenderPass = LightingRenderPass(view: metalView)
You initialize the lighting render pass.

➤ Add this line to mtkView(_:drawableSizeWillChange:):

lightingRenderPass.resize(view: view, size: size)
You currently don't resize any textures in LightingRenderPass. However, it's a good idea to call the resize method in case you add anything later.

➤ In draw(scene:in:), locate if options.renderChoice == .deferred.

➤ At the end of the conditional closure, add:

lightingRenderPass.albedoTexture = gBufferRenderPass.albedoTexture
lightingRenderPass.normalTexture = gBufferRenderPass.normalTexture
lightingRenderPass.positionTexture = gBufferRenderPass.positionTexture
lightingRenderPass.descriptor = descriptor
lightingRenderPass.draw(
  commandBuffer: commandBuffer,
  scene: scene,
  uniforms: uniforms,
  params: params)
Here, you pass the textures to the lighting pass and set the render pass descriptor. You then process the lighting render pass. You’ve set up everything on the CPU side. Now it’s time to turn to the GPU.
The Lighting Shader Functions

First, you'll create a vertex function that will position a quad. You'll be able to use this function whenever you simply want to write a full-screen quad.

➤ Open Deferred.metal, and add an array of six vertices for the quad:

constant float3 vertices[6] = {
  float3(-1,  1, 0),  // triangle 1
  float3( 1, -1, 0),
  float3(-1, -1, 0),
  float3(-1,  1, 0),  // triangle 2
  float3( 1,  1, 0),
  float3( 1, -1, 0)
};
➤ Add the new vertex function:

vertex VertexOut vertex_quad(uint vertexID [[vertex_id]]) {
  VertexOut out {
    .position = float4(vertices[vertexID], 1)
  };
  return out;
}
For each of the six vertices, you return the position in the vertices array.

➤ Add the new fragment function:

fragment float4 fragment_deferredSun(
  VertexOut in [[stage_in]],
  constant Params &params [[buffer(ParamsBuffer)]],
  constant Light *lights [[buffer(LightBuffer)]],
  texture2d<float> albedoTexture [[texture(BaseColor)]],
  texture2d<float> normalTexture [[texture(NormalTexture)]],
  texture2d<float> positionTexture [[texture(NormalTexture + 1)]])
{
  return float4(1, 0, 0, 1);
}
These are standard parameters for a fragment function. For now, you return the color red. ➤ Build and run the app, and you’ll see a red screen, which is an excellent result as this is the color you currently return from fragment_deferredSun.
Returning red from the fragment function

You can now work out the lighting and make your render a little more exciting.

➤ Replace the contents of fragment_deferredSun with:

uint2 coord = uint2(in.position.xy);
float4 albedo = albedoTexture.read(coord);
float3 normal = normalTexture.read(coord).xyz;
float3 position = positionTexture.read(coord).xyz;
Material material {
  .baseColor = albedo.xyz,
  .specularColor = float3(0),
  .shininess = 500
};
float3 color = phongLighting(
  normal,
  position,
  params,
  lights,
  material);
color *= albedo.a;
return float4(color, 1);
Since the quad is the same size as the screen, in.position matches the screen position, so you can use it as coordinates for reading the textures. You then call phongLighting with the values you read from the textures. The material values should be captured in the previous G-buffer pass, but they're added here for brevity.
You stored the shadow in the albedo alpha channel in the G-buffer pass. So, after calculating the Phong lighting for the sun and ambient lights, you simply multiply by the alpha channel to apply the shadow.

➤ Build and run the app.
Accumulating the directional light and shadows

The result should be the same as the forward render pass, except for the point lights. However, you'll notice that the sky is a different color. The color changed because you calculate the sunlight for every fragment on the screen, even where there's no original model geometry. In the next chapter, you'll deal with this problem by using stencil masks.

This seems like a lot of work for not much gain so far. But now comes the payoff! Instead of rendering just 20 point lights, you'll render as many as your device can take: on the order of thousands more than the forward renderer.
Adding Point Lights

So far, you've drawn the plain albedo and shaded it with directional light. You need a second fragment function for calculating point lights. In the original forward pass algorithm, you iterated through all the point lights for every fragment and performed the point light accumulation. Now you'll render a light volume in the shape of a sphere for every point light. Only the fragments you render for that light volume will require the point light accumulation.

The problem comes when one light volume is in front of another. The fragment function result will overwrite any previous result. You'll overcome this problem by accumulating the result into the final drawable by blending rather than overwriting the fragment.
Blending the light volume

Inside your starter project is a model named icosphere.obj. You'll render one of these for each point light.
Icosphere and UV sphere

The icosphere is a low-resolution sphere with only sixty vertices. Compare it to a UV sphere. The icosphere's faces are more regular and all have a similar area, whereas the UV sphere's faces are smaller at the top and bottom poles.

This app assumes that all point lights have the same radius attenuation, which fits inside the icosphere. If a point light has a larger radius, the icosphere's straight edges could cut it off. You could also add more vertices to the icosphere, making it rounder, but that would make the rendering less efficient.

➤ Open LightingRenderPass.swift, and add a new property to LightingRenderPass:

var icosphere = Model(name: "icosphere.obj")
You initialize the sphere for later use.
Now you need a new pipeline state object with new shader functions to render the sphere.

➤ Open Pipelines.swift, and copy createForwardPSO(colorPixelFormat:) to a new method called createPointLightPSO(colorPixelFormat:).

➤ Change the vertex function's name to "vertex_pointLight" and the fragment function's name to "fragment_pointLight".

Later, you'll need to add blending to the light accumulation. The pipeline state is how you tell the GPU that you require blending, so shortly, you'll add this to the pipeline state object.

➤ Open LightingRenderPass.swift, and add a new property for the pipeline state object:

var pointLightPSO: MTLRenderPipelineState
➤ Initialize the pipeline state in init(view:):

pointLightPSO = PipelineStates.createPointLightPSO(
  colorPixelFormat: view.colorPixelFormat)
➤ Create a new method to draw the point light volumes:

func drawPointLight(
  renderEncoder: MTLRenderCommandEncoder,
  scene: GameScene,
  params: Params
) {
  renderEncoder.pushDebugGroup("Point lights")
  renderEncoder.setRenderPipelineState(pointLightPSO)
  renderEncoder.setVertexBuffer(
    scene.lighting.pointBuffer,
    offset: 0,
    index: LightBuffer.index)
  renderEncoder.setFragmentBuffer(
    scene.lighting.pointBuffer,
    offset: 0,
    index: LightBuffer.index)
}
The vertex function needs the light position to position each icosphere, while the fragment function needs the light attenuation and color.

➤ After the previous code, add:

guard let mesh = icosphere.meshes.first,
  let submesh = mesh.submeshes.first else { return }
for (index, vertexBuffer) in mesh.vertexBuffers.enumerated() {
  renderEncoder.setVertexBuffer(
    vertexBuffer,
    offset: 0,
    index: index)
}
You set up the vertex buffers with the icosphere’s mesh attributes.
Instancing

If you had one thousand point lights, a draw call for each light volume would bring your system to a crawl. Instancing is a great way to tell the GPU to draw the same geometry a specific number of times. The GPU informs the vertex function which instance it's currently drawing so that you can extract information from arrays containing instance information.

In SceneLighting, you have an array of point lights with the position and color. Each of these point lights is an instance. You'll draw the icosphere mesh for each point light.

➤ After the previous code, add the draw call:

renderEncoder.drawIndexedPrimitives(
  type: .triangle,
  indexCount: submesh.indexCount,
  indexType: submesh.indexType,
  indexBuffer: submesh.indexBuffer,
  indexBufferOffset: submesh.indexBufferOffset,
  instanceCount: scene.lighting.pointLights.count)
renderEncoder.popDebugGroup()
Adding instanceCount to the draw call means that the GPU will repeat drawing the icosphere's vertex and submesh information for the specified number of instances. GPU hardware is optimized to do this.

➤ Call this new method from draw(commandBuffer:scene:uniforms:params:) before renderEncoder.endEncoding():

drawPointLight(
  renderEncoder: renderEncoder,
  scene: scene,
  params: params)
Creating the Point Light Shader Functions

➤ Open Deferred.metal, and add the new structures that the vertex function will need:

struct PointLightIn {
  float4 position [[attribute(Position)]];
};

struct PointLightOut {
  float4 position [[position]];
  uint instanceId [[flat]];
};
You're only interested in the position, so you only use the vertex descriptor's position attribute. You send the position to the rasterizer, but you also send the instance ID so that the fragment function can extract the light details from the point lights array. You don't want any rasterizer interpolation, so you mark the instance ID with the attribute [[flat]].

➤ Add the new vertex function:

vertex PointLightOut vertex_pointLight(
  PointLightIn in [[stage_in]],
  constant Uniforms &uniforms [[buffer(UniformsBuffer)]],
  constant Light *lights [[buffer(LightBuffer)]],
  // 1
  uint instanceId [[instance_id]])
{
  // 2
  float4 lightPosition = float4(lights[instanceId].position, 0);
  float4 position =
    uniforms.projectionMatrix * uniforms.viewMatrix
    // 3
    * (in.position + lightPosition);
  PointLightOut out {
    .position = position,
    .instanceId = instanceId
  };
  return out;
}
The points of interest are:

1. Use the attribute [[instance_id]] to detect the current instance.
2. Use the instance ID to index into the lights array.
3. Add the light's position to the vertex position. Since you're not dealing with scaling or rotation, you don't need to multiply by a model matrix.

➤ Add the new fragment function:

fragment float4 fragment_pointLight(
  PointLightOut in [[stage_in]],
  texture2d<float> normalTexture [[texture(NormalTexture)]],
  texture2d<float> positionTexture [[texture(NormalTexture + 1)]],
  constant Light *lights [[buffer(LightBuffer)]])
{
  Light light = lights[in.instanceId];
  uint2 coords = uint2(in.position.xy);
  float3 normal = normalTexture.read(coords).xyz;
  float3 position = positionTexture.read(coords).xyz;
  Material material {
    .baseColor = 1
  };
  float3 lighting =
    calculatePoint(light, position, normal, material);
  lighting *= 0.5;
  return float4(lighting, 1);
}
You extract the light from the lights array using the instance ID sent by the vertex function. Just as you did for the sun light, you read in the textures from the previous render pass and calculate the point lighting. You halve the intensity, as blending will make the lights brighter.

Notice the base color is 1. Rather than taking the color from the albedo, this time, you'll achieve the glowing point light with GPU blending.

➤ Build and run the app. Your render is as before. No glowing point lights here.
Capture the GPU workload and check out the attachments in the Point lights render encoder section:
Point light volume drawing

The icospheres are drawing, but if you check the depth attachment with the magnifier, you'll see all the values are zero. There's a problem with the depth: the icospheres are not rendering in front of the quad.

➤ Open LightingRenderPass.swift, and add a new method:

static func buildDepthStencilState() -> MTLDepthStencilState? {
  let descriptor = MTLDepthStencilDescriptor()
  descriptor.isDepthWriteEnabled = false
  return Renderer.device.makeDepthStencilState(descriptor: descriptor)
}
This code overrides the default RenderPass protocol method. You disable depth writes because you always want the icospheres to render. ➤ Build and run the app.
Rendering icospheres

Your icosphere volumes now render in front of the quad.
Blending

➤ Open Pipelines.swift. In createPointLightPSO(colorPixelFormat:), add this code before return:

let attachment = pipelineDescriptor.colorAttachments[0]
attachment?.isBlendingEnabled = true
attachment?.rgbBlendOperation = .add
attachment?.alphaBlendOperation = .add
attachment?.sourceRGBBlendFactor = .one
attachment?.sourceAlphaBlendFactor = .one
attachment?.destinationRGBBlendFactor = .one
attachment?.destinationAlphaBlendFactor = .zero
Here, you enable blending. It isn't on by default because blending is an expensive operation. The other properties determine how to combine the source and destination fragments. All of these blending properties are at their defaults except for destinationRGBBlendFactor; they're written out in full to show what you can change. The important change is destinationRGBBlendFactor from zero to one, so blending will occur.
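As a rough worked example of what these settings mean: with the add operation and both RGB factors set to one, the blended output for each fragment is source × 1 + destination × 1, so every point light volume simply adds its contribution on top of whatever color is already in the drawable. With the default destination factor of zero, the same equation collapses to source × 1 + destination × 0, and the new fragment would overwrite the old one instead of accumulating.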
The icospheres will blend with the color already drawn in the background quad. Black will disappear, leaving only the light color. ➤ Build and run the app.
A few point lights rendering

Now it's time to crank up those point lights. Be careful when you switch to the forward renderer. If you have a lot of lights, your system will appear to hang while the forward render pass laboriously calculates the point light effect on each fragment.

➤ Open SceneLighting.swift. In init(), replace:

pointLights = Self.createPointLights(
  count: 20,
  min: [-3, 0.1, -3],
  max: [3, 0.3, 3])
➤ With:

pointLights = Self.createPointLights(
  count: 200,
  min: [-6, 0.1, -6],
  max: [6, 0.3, 6])
With this code, you create 200 point lights, specifying the minimum and maximum xyz values to constrain the lights to that area.
➤ Build and run the app.
Two hundred point lights

Your point lights blend beautifully and are comparable with the forward renderer. The only problem is the color of the sky, which you'll fix in the following chapter.

➤ On the Debug navigator, check your FPS for both Deferred and Forward.

➤ Continue gradually increasing count in the previous code until your forward pass FPS decreases below 60 FPS. Make a note of the number of lights for comparison.

➤ Increase count and check how many point lights your deferred render can manage before degrading below 60 FPS. Many lights are so bright that you may have to dial down the light color in createPointLights(count:min:max:) with light.color *= 0.2.
Render algorithm comparison

On my M1 Mac mini, performance in a small window starts degrading on the forward renderer at about 400 lights, whereas the deferred renderer can cope with 10,000 lights. With the window maximized, forward rendering starts degrading at about 30 to 40 lights, whereas the deferred renderer manages more than 600. On an iPhone 12 Pro, the forward renderer degraded at 100 lights, whereas the deferred renderer could manage 1500 at 60 FPS.
This chapter has opened your eyes to two rendering techniques: forward and deferred. As the game designer, you get to choose your rendering method. Forward and deferred rendering are just two of many; several other techniques can help you get the most out of your frame time. There are also many ways of configuring your forward and deferred render passes. references.markdown in the resources folder for this chapter has a few links for further research.

In the next chapter, you'll learn how to make your deferred render pass even faster by taking advantage of Apple silicon.
Key Points

• Forward rendering processes all lights for all fragments.
• Deferred rendering captures albedo, position and normals for later light calculation. For point lights, only the necessary fragments are rendered.
• The G-buffer, or Geometry Buffer, is a conventional term for the albedo, position and normal textures and any other information you capture through a first pass.
• An icosphere model provides a volume for rendering the shape of a point light.
• Using instancing, the GPU can efficiently render the same geometry many times.
• The pipeline state object specifies whether the result from the fragment function should be blended with the currently attached texture.
Chapter 15: Tile-Based Deferred Rendering
Up to this point, you’ve treated the GPU as an immediate mode renderer (IMR) without referring much to Apple-specific hardware. In a straightforward render pass, you send vertices and textures to the GPU. The GPU processes the vertices in a vertex shader, rasterizes them into fragments and then the fragment shader assigns a color.
Immediate mode pipeline
When you have multiple passes, the GPU uses system memory to transfer resources between them.
Immediate mode using system memory

Starting with the A7 64-bit mobile chip, Apple began transitioning to a tile-based deferred rendering (TBDR) architecture. With the arrival of Apple Silicon on Macs, this transition is complete. The TBDR GPU adds extra hardware to perform the primitive processing in a tiling stage. This process breaks up the screen into tiles and assigns the geometry from the vertex stage to a tile. It then forwards each tile to the rasterizer. Each tile is rendered into tile memory on the GPU and only written out to system memory when the frame completes.
TBDR pipeline
Programmable Blending

Tile memory enables programmable blending: instead of writing a texture in one pass and reading it in the next, a fragment function can directly read color attachment textures within a single pass.
Programmable blending with memoryless textures

The G-buffer doesn't have to transfer the temporary textures to system memory anymore. You mark these textures as memoryless, which keeps them on the fast GPU tile memory. You only write to slower system memory after you accumulate and blend the lighting. This speeds up rendering because you use less bandwidth.
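As a minimal sketch of what marking a texture as memoryless looks like (this isn't the project's exact code; later in the chapter you change an existing makeTexture call instead), a render target descriptor might be configured like this:

// Sketch only: a memoryless texture can only be used as a transient
// render pass attachment; its contents never reach system memory and
// the CPU can never read them.
let descriptor = MTLTextureDescriptor.texture2DDescriptor(
  pixelFormat: .rgba16Float,
  width: 1024,
  height: 1024,
  mipmapped: false)
descriptor.usage = [.renderTarget]
descriptor.storageMode = .memoryless
let texture = Renderer.device.makeTexture(descriptor: descriptor)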
Tiled Deferred Rendering

Confusingly, tiled deferred rendering can refer to the deferred rendering or shading technique as well as the name of an architecture. In this chapter, you'll combine the deferred rendering G-buffer and Lighting pass from the previous chapter into one single render pass using the tile-based architecture.

To complete this chapter, you need to run the code on a device with an Apple GPU. This device could be an Apple Silicon macOS device or any iOS device running the latest iOS 15. The iOS simulator and Intel Macs can't handle the code, but the starter project will run Forward Rendering in place of Tiled Deferred Rendering instead of crashing.
The Starter Project

➤ In Xcode, open the starter project for this chapter. This project is the same as the end of the previous chapter, except:

• In the SwiftUI Views group, there's a new option for tiledDeferred in Options.swift. Renderer will update tiledSupported depending on whether the device supports tiling.
• In the Render Passes group, the deferred rendering pipeline state creation methods in Pipelines.swift have an extra Boolean parameter of tiled:. Later, you'll assign a different fragment function depending on this parameter.
• A new file, TiledDeferredRenderPass.swift, combines GBufferRenderPass and LightingRenderPass into one long file. The code is substantially similar, with the two render passes combined into draw(commandBuffer:scene:uniforms:params:). You'll convert this file from the immediate mode deferred rendering algorithm to tile-based deferred rendering.
• Renderer instantiates TiledDeferredRenderPass if the device supports tiling.

➤ Build and run the app on your TBDR device.
The starter app

The render is the same as at the end of the previous chapter but with an added Tiled Deferred option under the Metal view.
Note: If you run the app on a non-TBDR device, the option will be marked Tiled Deferred not Supported!. Apple assigns GPU families to devices. The A7 is iOS GPU family 1. When using newer Metal features, use device.supportsFamily(_:) to check whether the current device supports the capabilities you’re requesting. In init(metalView:options:), Renderer checks the GPU family. If the device supports Apple family 3 GPUs, which Apple introduced with the A9 chip, it supports tile-based deferred rendering. ➤ Capture the GPU workload and refresh your memory on the render pass hierarchy:
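As a minimal sketch of that capability check (the tiledSupported property is mentioned in the starter project's description; the rest is an assumption about how such a check might look, not the book's exact code):

// Apple family 3 GPUs (A9 and later) support the tile-based features
// this chapter relies on.
let tiledSupported = Renderer.device.supportsFamily(.apple3)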
GPU frame capture
You have a G-buffer pass where you fill in the albedo, normal and position textures. You also have a Light accumulation pass, where you render a quad and calculate the lighting using the G-buffer textures.

In the Render Passes group, open TiledDeferredRenderPass.swift and examine the code. draw(commandBuffer:scene:uniforms:params:) contains both the G-buffer pass and the Lighting pass. There's a lot of code, but you should recognize it from the previous chapter.

➤ Currently, this is what happens during your render passes:
Starter app render passes

You write the G-buffer textures to system memory and then read them back from system memory. These are the steps you'll take to move the G-buffer textures to tile memory:

1. Change the texture storage mode from private to memoryless.
2. Change the descriptor's color attachment store action for all the G-buffer textures to dontCare.
3. In the Lighting pass, stop sending the color attachment textures to the fragment function.
4. Create new fragment shaders for rendering the sun and point lights.
5. Combine the two render encoder passes with their descriptors into one.
6. Update the pipeline state objects to match the new render pass descriptor.

As you work through the chapter, you'll encounter common errors so you can learn how to fix them when you make them in the future.
1. Making the Textures Memoryless

➤ Open TiledDeferredRenderPass.swift. In resize(view:size:), change the storage mode for all four textures from storageMode: .private to:

storageMode: .memoryless
➤ Build and run the app. You’ll get an error: Memoryless attachment content cannot be stored in memory. You’re still storing the attachment back to system memory. Time to fix that.
2. Changing the Store Action

➤ Stay in TiledDeferredRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), find the for (index, texture) in textures.enumerated() loop and change attachment?.storeAction = .store to:

attachment?.storeAction = .dontCare
This line stops the textures from transferring to system memory.

➤ Build and run the app. You'll get another error: failed assertion `Set Fragment Buffers Validation texture is Memoryless, and cannot be assigned.`

For the Lighting pass, you send the textures to the fragment shader as texture parameters. However, you can't do that with memoryless textures because they're already resident on the GPU. You'll fix that next.
3. Removing the Fragment Textures

➤ In drawSunLight(renderEncoder:scene:params:), remove:

renderEncoder.setFragmentTexture(
  albedoTexture,
  index: BaseColor.index)
renderEncoder.setFragmentTexture(
  normalTexture,
  index: NormalTexture.index)
renderEncoder.setFragmentTexture(
  positionTexture,
  index: NormalTexture.index + 1)
➤ Build and run the app. You’ll probably get a black screen now because your deferred shader functions are expecting textures.
4. Creating the New Fragment Functions

➤ Still in TiledDeferredRenderPass.swift, in init(view:), change the three pipeline state objects' tiled: false parameters to:

tiled: true
➤ Open Pipelines.swift. In createSunLightPSO(colorPixelFormat:tiled:) and createPointLightPSO(colorPixelFormat:tiled:), check which fragment functions you need to create:

• fragment_tiled_deferredSun
• fragment_tiled_pointLight

You can still use the same vertex functions and G-buffer fragment function.

➤ In the Shaders group, open Deferred.metal.

➤ Copy fragment_deferredSun to a new function called fragment_tiled_deferredSun.

➤ In fragment_tiled_deferredSun, since you're not sending the fragment textures to the fragment function any more, remove the parameters:

texture2d<float> albedoTexture [[texture(BaseColor)]],
texture2d<float> normalTexture [[texture(NormalTexture)]],
texture2d<float> positionTexture [[texture(NormalTexture + 1)]]
➤ Add a new parameter:

GBufferOut gBuffer

GBufferOut is the structure that refers to the color attachment render target textures.
➤ Change:

uint2 coord = uint2(in.position.xy);
float4 albedo = albedoTexture.read(coord);
float3 normal = normalTexture.read(coord).xyz;
float3 position = positionTexture.read(coord).xyz;
➤ To:

float4 albedo = gBuffer.albedo;
float3 normal = gBuffer.normal.xyz;
float3 position = gBuffer.position.xyz;
Instead of swapping out to system memory, you now read the color attachment textures in the fast GPU tile memory. In addition to this speed optimization, you directly access the memory rather than reading in a texture.

Repeat this process for the point lights.

➤ Copy fragment_pointLight to a new function named fragment_tiled_pointLight.
➤ Remove the parameters:

texture2d<float> normalTexture [[texture(NormalTexture)]],
texture2d<float> positionTexture [[texture(NormalTexture + 1)]],
➤ Add the parameter:

GBufferOut gBuffer
➤ Change:

uint2 coords = uint2(in.position.xy);
float3 normal = normalTexture.read(coords).xyz;
float3 position = positionTexture.read(coords).xyz;
➤ To:

float3 normal = gBuffer.normal.xyz;
float3 position = gBuffer.position.xyz;
➤ Build and run the app. When creating the sun light pipeline state, you now get the error: Shaders reads from a color attachment whose pixel format is MTLPixelFormatInvalid.
These are the attachments you set up in the two render passes in the previous chapter’s Deferred Rendering:
Render pass descriptor color attachments

Currently, when writing the G-buffer in fragment_gBuffer, your render pass descriptor colorAttachments[0] is nil, and your pipeline state colorAttachments[0] pixel format is invalid. However, when calculating directional light, you only set up colorAttachments[0], which is done in Pipelines.swift, in createSunLightPSO(colorPixelFormat:tiled:). When fragment_tiled_deferredSun reads from gBuffer, the other color attachment pixel formats are invalid.

Instead of using two render pass descriptors and render command encoders, you'll configure the view's current render pass descriptor to use all the color attachments. Then you'll set up the pipeline state configuration to match.
5. Combining the Two Render Passes

➤ Open TiledDeferredRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), change let descriptor = MTLRenderPassDescriptor() to:

let descriptor = viewCurrentRenderPassDescriptor
You’ll use the view’s current render pass descriptor, passed in from Renderer, to configure your render command encoder.
➤ Still in draw(commandBuffer:scene:uniforms:params:), remove:

renderEncoder.endEncoding()

// MARK: Lighting pass
// Set up Lighting descriptor
guard let renderEncoder =
  commandBuffer.makeRenderCommandEncoder(
    descriptor: viewCurrentRenderPassDescriptor) else { return }
Here, you remove the second render command encoder.
6. Updating the Pipeline States

➤ Open Pipelines.swift. Add this code to createSunLightPSO(colorPixelFormat:tiled:) and createPointLightPSO(colorPixelFormat:tiled:) after setting colorAttachments[0].pixelFormat:

if tiled {
  pipelineDescriptor.setColorAttachmentPixelFormats()
}
This code sets the color pixel formats to match the render target textures.

➤ In createGBufferPSO(colorPixelFormat:tiled:), after setting colorAttachments[0].pixelFormat, add:

if tiled {
  pipelineDescriptor.colorAttachments[0].pixelFormat = colorPixelFormat
}
In the previous chapter, your G-buffer render pass descriptor had no texture in colorAttachments[0]. However, when you use the view's current render pass descriptor, colorAttachments[0] stores the view's current drawable texture, so you match that texture's pixel format.
Now you store the textures in tile memory and use a single render pass.
A single render pass

➤ Build and run the app.
The final render

Finally, you'll see the result you want. The render is the same whether you choose Tiled Deferred or Deferred.
➤ With the Tiled Deferred option selected, capture the GPU workload. You’ll see that all your textures, aside from the shadow pass, process in the single render pass.
The final frame capture

Your four memoryless render target textures show up as Memoryless on the capture, proving they aren't taking up any system memory.

Now for the exciting part: seeing how many lights you can run at 60 frames per second.

➤ Open SceneLighting.swift, and locate in init():

pointLights = Self.createPointLights(
  count: 40,
  min: [-6, 0.1, -6],
  max: [6, 1, 6])
➤ While running your app and choosing the Deferred option, slowly raise the count of point lights until you’re no longer getting 60 frames per second. Then see how many point lights you can get when choosing the Tiled Deferred option. On Tiled Deferred, my M1 Mac mini runs 18,000 point lights at 60 FPS in a small window. On Deferred, it’ll only achieve 38 FPS with the same number of lights. Don’t attempt Forward Rendering with this many lights!
Stencil Tests

The last step in completing your deferred rendering is to fix the sky. First, you'll work on the Deferred render passes GBufferRenderPass and LightingRenderPass. Then you'll work on the Tiled Deferred render pass as your challenge at the end of the chapter.

Currently, when you render the quad in the lighting render pass, you accumulate the directional lighting on all the quad's fragments. Wouldn't it be great to only process fragments where model geometry is rendered? Fortunately, that's what stencil testing was designed to do. In the following image, the stencil texture is on the right. The black area should mask the image so that only the white area renders.
Stencil testing

As you already know, part of rasterization is performing a depth test to ensure the current fragment is in front of any fragments already rendered. The depth test isn't the only test the fragment has to pass. You can configure a stencil test. Up to now, when you created the MTLDepthStencilState, you only configured the depth test. In the pipeline state objects, you set the depth pixel format to depth32float with a matching depth texture.

A stencil texture consists of 8-bit values, from 0 to 255. You'll add this texture to the depth buffer so that the depth buffer will consist of both depth texture and stencil texture.
For a better understanding of the stencil buffer, examine the following image.
A stencil texture

In this scenario, the buffer is initially cleared with zeros. When the pink triangle renders, the rasterizer increments the fragments the triangle covers. The second yellow triangle renders, and the rasterizer again increments the fragments that the triangle covers.
Stencil Test Configuration

All fragments must pass both the depth test and the stencil test that you configure in order to render. As part of the configuration, you set:

• The comparison function.
• The operation on pass or fail.
• A read and write mask.

Take a closer look at the comparison function.
1. The Comparison Function

When the rasterizer performs a stencil test, it compares a reference value with the value in the stencil texture using a comparison function. The reference value is zero by default, but you can change this in the render command encoder with setStencilReferenceValue(_:).
The comparison function is a mathematical comparison operator, such as equal or lessEqual. A comparison function of always will let the fragment pass the stencil test, whereas with a stencil comparison of never, the fragment will always fail. For instance, if you want to use the stencil buffer to mask out the yellow triangle area in the previous example, you could set a reference value of 2 in the render command encoder and then set the comparison to notEqual. Only fragments that don’t have their stencil buffer set to 2 will pass the stencil test.
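As a minimal sketch of that masking example (assuming a render command encoder named renderEncoder and a depth stencil descriptor named descriptor; this isn't code from the project):

// Pass the stencil test only where the stencil value is not equal to
// the reference value.
let stencil = MTLStencilDescriptor()
stencil.stencilCompareFunction = .notEqual
descriptor.frontFaceStencil = stencil

// The reference value the comparison uses; here, the yellow triangle's
// value of 2.
renderEncoder.setStencilReferenceValue(2)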
2. The Stencil Operation

Next, you set the stencil operations to perform on the stencil buffer. There are three possible results to configure:

• Stencil test failure.
• Stencil test pass and depth failure.
• Stencil test pass and depth pass.

The default operation for each result is keep, which doesn't change the stencil buffer. Other operations include:

• incrementClamp: Increments the stencil buffer fragment, clamping at the maximum of 255.
• incrementWrap: Increments the stencil buffer fragment and, if necessary, wraps around from 255 to 0.
• decrementClamp and decrementWrap: The same as the increment operations, except the stencil buffer value decreases.
• invert: Performs a bitwise NOT operation, which inverts all of the bits.
• replace: Replaces the stencil buffer fragment with the reference value.

To get the stencil buffer to increase when a triangle renders in the previous example, you perform the incrementClamp operation when the fragment passes the depth test.
3. The Read and Write Mask

There's one more wrinkle. You can specify a read mask and a write mask. By default, both masks are 255, or 11111111 in binary, so every bit takes part in the stencil comparison and every bit can be written by the stencil operation; the short sketch below shows the masks being set explicitly.

Now that you have the concept and principles under your belt, it's time to put all of this into practice.
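Here's that sketch, restricting the test and writes to the low bit only. This is purely illustrative; the chapter's own code leaves both masks at their defaults.

let stencil = MTLStencilDescriptor()
// Only bit 0 of the stencil value and reference value takes part in the
// comparison.
stencil.readMask = 0b0000_0001
// Only bit 0 can be changed by the stencil operations.
stencil.writeMask = 0b0000_0001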
Create the Stencil Texture

The stencil texture buffer is an extra 8-bit buffer attached to the depth texture buffer. You optionally configure it when you configure the depth buffer.

➤ Open Pipelines.swift. In createGBufferPSO(colorPixelFormat:tiled:), after pipelineDescriptor.depthAttachmentPixelFormat = .depth32Float, add:

if !tiled {
  pipelineDescriptor.depthAttachmentPixelFormat = .depth32Float_stencil8
  pipelineDescriptor.stencilAttachmentPixelFormat = .depth32Float_stencil8
}
This code configures both the depth and stencil attachment to use one texture, including the 32-bit depth and the 8-bit stencil buffers.

➤ Open GBufferRenderPass.swift. In resize(view:size:), change depthTexture to:

depthTexture = Self.makeTexture(
  size: size,
  pixelFormat: .depth32Float_stencil8,
  label: "Depth and Stencil Texture")
Here, you create the texture with the matching pixel format.

➤ In draw(commandBuffer:scene:uniforms:params:), after configuring the descriptor's depth attachment, add:

descriptor?.stencilAttachment.texture = depthTexture
descriptor?.stencilAttachment.storeAction = .store
With this code, you tell the descriptor to use the depth texture as the stencil attachment and store the texture after use. ➤ Build and run the app, and choose the Deferred option. ➤ Capture the GPU workload and examine the command buffer.
New stencil texture

Sure enough, you now have a stencil texture along with your other textures.
Configure the Stencil Operation

➤ Open GBufferRenderPass.swift, and add this new method:

static func buildDepthStencilState() -> MTLDepthStencilState? {
  let descriptor = MTLDepthStencilDescriptor()
  descriptor.depthCompareFunction = .less
  descriptor.isDepthWriteEnabled = true
  return Renderer.device.makeDepthStencilState(
    descriptor: descriptor)
}
This is the same method that creates the depth stencil state object in RenderPass, but here you'll override it with your stencil configuration.
➤ Add the following code to buildDepthStencilState() before return:

let frontFaceStencil = MTLStencilDescriptor()
frontFaceStencil.stencilCompareFunction = .always
frontFaceStencil.stencilFailureOperation = .keep
frontFaceStencil.depthFailureOperation = .keep
frontFaceStencil.depthStencilPassOperation = .incrementClamp
descriptor.frontFaceStencil = frontFaceStencil

frontFaceStencil affects the stencil buffer only for models' faces facing the camera. The stencil test will always pass, and nothing happens if the stencil or depth tests fail. If the depth and stencil tests pass, the stencil buffer increases by 1.

➤ Build and run the app, and choose the Deferred option.

➤ Capture the GPU workload and examine the stencil buffer with the magnifier:
The ground is rendered in front of the trees and sometimes fails the depth test

Most of the texture is mid-gray with a value of 1. On the trees, which are mostly 1, there are small patches of 2, which incidentally uncovers some inefficient overlapping geometry in the tree model.

It's important to realize that the geometry is processed in the order it's rendered. In GameScene, this is set up as:

models = [treefir1, treefir2, treefir3, train, ground]
The ground is the last to render. It fails the depth test when the fragment is behind a tree or the train and doesn't increment the stencil buffer. Compare this with a stencil test where the ground is the first to render.
➤ Open GameScene.swift. In init(), change the models assignment to:

models = [ground, treefir1, treefir2, treefir3, train]
This code renders the ground first. ➤ Build and run the app, and choose the Deferred option. ➤ Capture the GPU workload and compare the stencil texture.
Ground renders first

When the tree renders this time, the ground, having passed the depth test, has already incremented the stencil buffer to 1, so the tree passes the depth test and increments the buffer to 2, then 3 where there is extra geometry. You now have a stencil texture with zero where no geometry renders and non-zero where there is geometry.

All this aims to compute deferred lighting only in those areas with geometry. You can achieve this with your current stencil texture. Where the stencil buffer is zero, you can ignore the fragment in the light render pass. To achieve this, you'll:

1. Pass in the depth/stencil texture from GBufferRenderPass to LightingRenderPass.
2. In addition to setting LightingRenderPass's render pass descriptor's stencil attachment, you must assign the depth texture to the descriptor's depth attachment because the stencil texture is combined with the depth texture.
3. LightingRenderPass uses two pipeline states: one for the sun and one for point lights. Both must have the depth and stencil pixel format of depth32Float_stencil8.
1. Passing in the Depth/Stencil Texture

➤ Open LightingRenderPass.swift, and add a new texture property to LightingRenderPass:

weak var stencilTexture: MTLTexture?
➤ Add this line to the top of draw(commandBuffer:scene:uniforms:params:):

descriptor?.stencilAttachment.texture = stencilTexture
➤ Open Renderer.swift. In draw(scene:in:), add this line where you assign the textures to lightingRenderPass:

lightingRenderPass.stencilTexture = gBufferRenderPass.depthTexture
2. Setting Up the Render Pass Descriptor

➤ Open LightingRenderPass.swift. At the top of draw(commandBuffer:scene:uniforms:params:), add:

descriptor?.depthAttachment.texture = stencilTexture
descriptor?.stencilAttachment.loadAction = .load
descriptor?.depthAttachment.loadAction = .dontCare
You set the stencil attachment to load so that the LightingRenderPass can use the stencil texture for stencil testing. You don’t need the depth texture, so you set a load action of dontCare.
3. Changing the Pipeline State Objects

➤ Open Pipelines.swift. In both createSunLightPSO(colorPixelFormat:tiled:) and createPointLightPSO(colorPixelFormat:tiled:), after pipelineDescriptor.depthAttachmentPixelFormat = .depth32Float, add:

if !tiled {
  pipelineDescriptor.depthAttachmentPixelFormat = .depth32Float_stencil8
  pipelineDescriptor.stencilAttachmentPixelFormat = .depth32Float_stencil8
}
This code configures the pipeline state to match the render pass descriptor’s depth and stencil texture pixel format. ➤ Build and run the app, and choose the Deferred option. ➤ Capture the GPU workload and examine the frame so far.
Stencil texture in frame capture

LightingRenderPass correctly receives the stencil buffer from GBufferRenderPass.
Masking the Sky

When you render the quad in LightingRenderPass, you want to bypass all fragments that are zero in the stencil buffer.

➤ Open LightingRenderPass.swift, and add this code to buildDepthStencilState() before return:

let frontFaceStencil = MTLStencilDescriptor()
frontFaceStencil.stencilCompareFunction = .equal
frontFaceStencil.stencilFailureOperation = .keep
frontFaceStencil.depthFailureOperation = .keep
frontFaceStencil.depthStencilPassOperation = .keep
descriptor.frontFaceStencil = frontFaceStencil
(Spoiler: Deliberate mistake. :])
You haven’t changed the reference value in the render command encoder, so the reference value is zero. Here, you say that all stencil buffer fragments equal to zero will pass the stencil test. You don’t need to change the stencil buffer, so all of the operations are keep. ➤ Build and run the app, and choose the Deferred option.
A deliberate mistake

In this render, all fragments that are zero render. That's the top part. The bottom section with the plane and trees doesn't render but shows the clear blue sky background. Of course, it should be the other way around.

➤ In buildDepthStencilState(), change the stencil compare function:

frontFaceStencil.stencilCompareFunction = .notEqual
➤ Build and run the app, then choose the Deferred option.
Clear blue skies

At last, the brooding, stormy sky is replaced by the Metal view's blue MTLClearColor that you set way back in Renderer's initializer.
Challenge

You fixed the sky for your Deferred Rendering pass. Your challenge is now to fix it in the Tiled Deferred render pass. Here's a hint: just follow the steps for the Deferred render pass. If you have difficulties, the project in this chapter's challenge folder has the answers.
Key Points

• Tile-based deferred rendering takes advantage of Apple's special GPUs.
• Keeping data in tile memory rather than transferring to system memory is much more efficient and uses less power.
• Mark textures as memoryless to keep them in tile memory.
• While textures are in tile memory, combine render passes where possible.
• Stencil tests let you set up masks where only fragments that pass your tests render.
• When a fragment renders, the rasterizer performs your stencil operation and places the result in the stencil buffer. With this stencil buffer, you control which parts of your image render.
Where to Go From Here?

Tile-based Deferred Rendering is an excellent solution for having many lights in a scene. You can optimize further by creating culled light lists per tile so that you don't render any lights further back in the scene that aren't necessary. Apple's Modern Rendering with Metal 2019 video (https://apple.co/3mfdtEY) will help you understand how to do this. The video also points out when to use various rendering technologies.
Chapter 16: GPU Compute Programming
General Purpose GPU (GPGPU) programming uses the many-core GPU architecture to speed up parallel computation. Data-parallel compute processing is useful when you have large chunks of data and need to perform the same operation on each chunk. Examples include machine learning, scientific simulations, ray tracing and image/video processing. In this chapter, you’ll perform some simple GPU programming and explore how to use the GPU in ways other than vertex rendering.
The Starter Project

➤ Open Xcode and build and run this chapter's starter project. The scene contains a lonely warrior. The renderer is the forward renderer using your Phong shader.
The starter project

From this render, you might think that the warrior is left-handed. Depending on how you render him, he can be ambidextrous.

➤ Press 1 on your keyboard. The view changes to the front view. However, the warrior faces towards positive z instead of toward the camera.
Facing backwards

The way the warrior renders is due to both math and file formats. In Chapter 6, "Coordinate Spaces", you learned that this book uses a left-handed coordinate system. Blender exports the obj file for use in a right-handed coordinate system.
If you want a right-handed warrior, there are a few ways to solve this issue:

1. Rewrite all of your coordinate positioning.
2. In vertex_main, invert position.z when rendering the model.
3. On loading the model, invert position.z.

If all of your models are reversed, option #1 or #2 might be good. However, if you only need some models reversed, option #3 is the way to go. All you need is a fast parallel operation. Thankfully, one is available to you using the GPU.

Note: Ideally, you would convert the model as part of your model pipeline rather than in your final app. After flipping the vertices, you can write the model out to a new file.
Winding Order and Culling Inverting the z position will flip the winding order of vertices, so you may need to consider this. When Model I/O reads in the model, the vertices are in clockwise winding order. ➤ To demonstrate this, open ForwardRenderPass.swift. ➤ In draw(commandBuffer:scene:uniforms:params:), add this code after renderEncoder.setRenderPipelineState(pipelineState): renderEncoder.setFrontFacing(.counterClockwise) renderEncoder.setCullMode(.back)
Here, you tell the GPU to expect vertices in counterclockwise order. The default is clockwise. You also tell the GPU to cull any faces that face away from the camera. As a general rule, you should cull back faces since they’re usually hidden, and rendering them isn’t necessary.
➤ Build and run the app.
Rendering with incorrect winding order
Because the winding order of the mesh is currently clockwise, the GPU is culling the wrong faces, and the model appears to be inside-out. Rotate the model to see this more clearly. Inverting the z coordinates will correct the winding order.
Reversing the Model on the CPU Before working out the parallel algorithm for the GPU, you’ll first explore how to reverse the warrior on the CPU. You’ll compare the performance with the GPU result. In the process, you’ll learn how to access and change Swift data buffer contents with pointers. ➤ In the Geometry group, open VertexDescriptor.swift. Take a moment to refresh your memory about the layout in which Model I/O loads the model buffers in defaultLayout. Five buffers are involved, but you’re only interested in the first one, VertexBuffer. It consists of a float3 for Position and a float3 for Normal. You don’t need to consider UVs because they’re in the next layout. ➤ In the Shaders group, open Common.h, and add a new structure: struct VertexLayout { vector_float3 position; vector_float3 normal; };
This structure matches the layout in the vertex descriptor for buffer 0. You’ll use it to read the loaded mesh buffer.
➤ In the Game group, open GameScene.swift, and add a new method to GameScene:
mutating func convertMesh(_ model: Model) {
  let startTime = CFAbsoluteTimeGetCurrent()
  for mesh in model.meshes {
    // 1
    let vertexBuffer = mesh.vertexBuffers[VertexBuffer.index]
    let count = vertexBuffer.length / MemoryLayout<VertexLayout>.stride
    // 2
    var pointer = vertexBuffer
      .contents()
      .bindMemory(to: VertexLayout.self, capacity: count)
    // 3
    for _ in 0..<count {
      pointer.pointee.position.z = -pointer.pointee.position.z
      pointer = pointer.advanced(by: 1)
    }
  }
  print("CPU conversion took \(CFAbsoluteTimeGetCurrent() - startTime) seconds")
}
Later, the fireworks update adds a new emitter each time a timer reaches its threshold, removing the oldest one once there are more than maxEmitters:
if emitters.count > maxEmitters {
  emitters.removeFirst()
}
let emitter = FireworksEmitter(
  particleCount: particleCount,
  size: size,
  life: life)
emitters.append(emitter)
You reset a timer variable every time it reaches a threshold (50 in this case). At that point, you add a new emitter and then remove the oldest one. Build and run the app to verify everything is working. Note, however, you’ll see the same solid dark blue color as you did in the beginning as you’re not doing any drawing yet. The shader function is where you’ll update the particles’ life and position. Each of the particles is updated independently, and so will work well with the granular control of GPU threads in compute encoding!
The Compute Pipeline State Object
➤ Open Pipelines.swift and add this to PipelineStates:
static func createComputePSO(function: String) -> MTLComputePipelineState {
  guard let kernel = Renderer.library.makeFunction(name: function) else {
    fatalError("Unable to create \(function) PSO")
  }
  let pipelineState: MTLComputePipelineState
  do {
    pipelineState = try Renderer.device.makeComputePipelineState(function: kernel)
  } catch {
    fatalError(error.localizedDescription)
  }
  return pipelineState
}
As you learned in the previous chapter, for a compute pipeline state, you don’t need a pipeline state descriptor. You simply create the pipeline state directly from the kernel function. You’ll be able to use this method to create compute pipeline state objects using only the name of the kernel.
The Fireworks Pass
➤ Open Fireworks.swift, and add the pipeline states and initializer:
let clearScreenPSO: MTLComputePipelineState
let fireworksPSO: MTLComputePipelineState

init() {
  clearScreenPSO = PipelineStates.createComputePSO(function: "clearScreen")
  fireworksPSO = PipelineStates.createComputePSO(function: "fireworks")
}
When you display your particles on the screen, you’ll write them to the drawable texture. Because you’re not going through the render pipeline, you will need to clear the texture to a night sky black. You’ll do this in a simple compute shader. Note: In this particular application, because you’re using the drawable texture, you could set the initial metal view’s clear color to black instead of running a clear screen kernel function. But clearing the drawable texture will give you practice in using a 2D grid, and the skill to clear other textures too. After clearing the screen, you’ll then be able to calculate the fireworks particles. You will require a different kernel function for each process, requiring two different pipeline state objects.
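For comparison, the alternative the note describes would be a single line in Renderer’s initializer — shown here only as a sketch of that option, not something you need to add:
metalView.clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 1)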
Clearing the Screen
The clear screen kernel function will run on every pixel in the drawable texture. The texture has a width and height, and is therefore a two dimensional grid.
➤ Still in Fireworks.swift, add this to draw(commandBuffer:view:):
// 1
guard let computeEncoder = commandBuffer.makeComputeCommandEncoder(),
  let drawable = view.currentDrawable else { return }
computeEncoder.setComputePipelineState(clearScreenPSO)
computeEncoder.setTexture(drawable.texture, index: 0)
// 2
var threadsPerGrid = MTLSize(
  width: Int(view.drawableSize.width),
  height: Int(view.drawableSize.height),
  depth: 1)
// 3
let width = clearScreenPSO.threadExecutionWidth
var threadsPerThreadgroup = MTLSize(
  width: width,
  height: clearScreenPSO.maxTotalThreadsPerThreadgroup / width,
  depth: 1)
// 4
computeEncoder.dispatchThreads(
  threadsPerGrid,
  threadsPerThreadgroup: threadsPerThreadgroup)
computeEncoder.endEncoding()
Going through the code:
1. You create the compute command encoder and set its pipeline state. You also send the drawable texture to the GPU at index 0 for writing.
2. The grid size is the drawable’s width by the drawable’s height.
3. Using the pipeline state thread values, you calculate the number of threads per threadgroup.
4. You dispatch these threads to do the work in parallel on the GPU.
➤ In the Shaders group, create a new Metal file named Fireworks.metal, and add the kernel functions you set up in your pipeline states:
#import "Common.h"

kernel void clearScreen(
  texture2d<half, access::write> output [[texture(0)]],
  uint2 id [[thread_position_in_grid]])
{
  output.write(half4(0.0, 0.0, 0.0, 1.0), id);
}

kernel void fireworks() { }

clearScreen takes, as arguments, the drawable texture you sent from the CPU and
the 2D thread index. You specify in the parameter that you will need write access. Then, you write the color black into the drawable texture for each thread/pixel. The kernel function’s id parameter uses the [[thread_position_in_grid]] attribute qualifier which uniquely locates a thread within the compute grid and enables it to work distinctly from the others. In this case, it identifies each pixel in the texture. You’ll return and fill out fireworks shortly. ➤ Open Renderer.swift, and add this at the end of init(metalView:options:): metalView.framebufferOnly = false
Because you’re writing to the view’s drawable texture in clearScreen, you need to change the underlying render target usage by enabling writing to the frame buffer.
➤ Take a look at draw(scene:in:). Renderer is already set up to call fireworks.update(size:) and fireworks.draw(commandBuffer:view:).
➤ Build and run the app. clearScreen writes the color to the view’s drawable texture, so you’ll finally see the view color turn from blue to a night sky, ready to display your fireworks.
Drawable cleared to black
Dispatching the Particle Buffer
Now that you’ve cleared the screen, you’ll set up a new encoder to dispatch the particle buffer to the GPU.
➤ Open Fireworks.swift, and add this to the end of draw(commandBuffer:view:):
// 1
guard let particleEncoder = commandBuffer.makeComputeCommandEncoder()
else { return }
particleEncoder.setComputePipelineState(fireworksPSO)
particleEncoder.setTexture(drawable.texture, index: 0)
// 2
threadsPerGrid = MTLSize(width: particleCount, height: 1, depth: 1)
for emitter in emitters {
  // 3
  let particleBuffer = emitter.particleBuffer
  particleEncoder.setBuffer(particleBuffer, offset: 0, index: 0)
  threadsPerThreadgroup = MTLSize(
    width: fireworksPSO.threadExecutionWidth,
    height: 1,
    depth: 1)
  particleEncoder.dispatchThreads(
    threadsPerGrid,
    threadsPerThreadgroup: threadsPerThreadgroup)
}
particleEncoder.endEncoding()
This code is very similar to the previous compute encoder setup: 1. You create a second command encoder and set the particle pipeline state and drawable texture to it. 2. You change the dimensionality from 2D to 1D and set the number of threads per grid to equal the number of particles. 3. You dispatch threads for each emitter in the array. Since your threadsPerGrid is now 1D, you’ll need to match [[thread_position_in_grid]] in the shader kernel function with a uint parameter. Threads will not be dispatched for each pixel anymore but rather for each particle, so [[thread_position_in_grid]] in this case will only affect a particular pixel in the drawable texture if there is a particle positioned at that pixel. All right, time for some physics chatter!
Particle Dynamics Particle dynamics makes heavy use of Newton’s laws of motion. Particles are considered to be small objects approximated as point masses. Since volume is not something that characterizes particles, scaling or rotational motion will not be considered. Particles will, however, make use of translation motion so they’ll always need to have a position. Besides a position, particles might also have a direction and speed of movement (velocity), forces that influence them (e.g., gravity), a mass, a color and an age. Since the particle footprint is so small in memory, modern GPUs can generate 4+ million particles, and they can follow the laws of motion at 60 fps. For now, you’re going to ignore gravity, so its value will be 0. Time in this example won’t change so it will have a value of 1. As a consequence, velocity will always be the same. You can also assume the particle mass is always 1, for convenience. To calculate the position, you use this formula:
x2 = x1 + v1 * t + 0.5 * a * t^2
Where x2 is the new position, x1 is the old position, v1 is the old velocity, t is time, and a is acceleration. This is the formula to calculate the new velocity from the old one:
v2 = v1 + a * t
However, since the acceleration is 0 in the fireworks example, the velocity will always have the same value. As you might remember from Physics class, the formula for velocity is:
velocity = speed * direction
Plugging all this information into the first formula above gives you the final equation to use in the kernel. Since the acceleration is 0, the last term cancels out, and with the time step fixed at 1:
newPosition = oldPosition + velocity
Finally, you’re creating exploding fireworks: your firework particles will move in a circle that keeps growing away from the initial emitter origin, so you need to know the parametric equation of a circle:
x = centerX + radius * cos(angle)
y = centerY + radius * sin(angle)
Using the angle that the particle direction makes with the axes, you can re-write the velocity equation from the parametric form of the circle equation using the trigonometric functions sine and cosine as follows:
xVelocity = speed * cos(direction)
yVelocity = speed * sin(direction)
Great! Why don’t you write all this down in code now?
Implementing Particle Physics
➤ Open Fireworks.metal and replace fireworks with:
kernel void fireworks(
  texture2d<half, access::write> output [[texture(0)]],
  // 1
  device Particle *particles [[buffer(0)]],
  uint id [[thread_position_in_grid]])
{
  // 2
  float xVelocity = particles[id].speed * cos(particles[id].direction);
  float yVelocity = particles[id].speed * sin(particles[id].direction) + 3.0;
  particles[id].position.x += xVelocity;
  particles[id].position.y += yVelocity;
  // 3
  particles[id].life -= 1.0;
  half4 color;
  color = half4(particles[id].color) * particles[id].life / 255.0;
  // 4
  color.a = 1.0;
  uint2 position = uint2(particles[id].position);
  output.write(color, position);
  output.write(color, position + uint2(0, 1));
  output.write(color, position - uint2(0, 1));
  output.write(color, position + uint2(1, 0));
  output.write(color, position - uint2(1, 0));
}
Going through this code:
1. Get the particle buffer from the CPU and use a 1D index to match the grid dimensions you dispatched earlier.
2. Compute the velocity and update the position for the current particle according to the laws of motion and using the circle equation as explained above.
3. Update the life variable and compute a new color after each update. The color will fade as the value held by the life variable gets smaller and smaller.
4. Write the updated color at the current particle position, as well as to the neighboring pixels to the left, right, top and bottom to create the look and feel of a thicker particle.
➤ Build and run the app, and finally you get to enjoy some fireworks!
Fireworks!
You can improve the realism of particle effects in at least a couple of ways. One of them is to attach a sprite or a texture to each particle in a render pass. Instead of a dull point, you’ll then be able to see a textured point which looks way more lively. You can practice this technique in the next particle endeavor: a snowing simulation.
Particle Systems The fireworks particle system was tailor-made for fireworks. However, particle systems can be very complex with many different options for particle movement, colors and sizes. In the Particles group, the Emitter class in the starter project is a simple example of a generic particle system where you can create many different types of particles using a particle descriptor.
For example, you’re going to create snow falling, but also a fire blazing upwards. These particle systems will have different speeds, textures and directions. ➤ In the Particles group, open ParticleEffects.swift, and examine the code. To create a particle system, you first create a descriptor which describes all the characteristics of each individual particle. Many of the properties in ParticleDescriptor are ClosedRanges. For example, as well as position, there is a positionXRange and positionYRange. This allows you to specify a starting position but also allows randomness within limits. If you specify a position of [10, 0], and a positionXRange of 0...180, then each particle will be within the range of 10 to 190. Each particle has a startScale and an endScale. By setting the startScale to 1 and the endScale to 0, you can make the particle get smaller over its lifespan.
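To make the range idea concrete, here’s a small Swift sketch of how a start position and a current scale could be derived from those descriptor values. The function names and exact property handling are illustrative — the real logic lives in Emitter.swift:
import simd

// Offsets the base position by a random amount within each range.
// e.g. base.x = 10 with xRange 0...180 gives an x between 10 and 190.
func spawnPosition(
  base: SIMD2<Float>,
  xRange: ClosedRange<Float>,
  yRange: ClosedRange<Float>
) -> SIMD2<Float> {
  [base.x + Float.random(in: xRange), base.y + Float.random(in: yRange)]
}

// Linearly interpolates from startScale to endScale over the particle's life.
func currentScale(start: Float, end: Float, age: Float, life: Float) -> Float {
  start + (end - start) * (age / life)
}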
Scaling over time
➤ Open Emitter.swift. When you create the emitter, as well as the particle descriptor, you also specify:
• texture: You’ll render this texture for each particle using the particle coordinates.
• particleCount: The total number of particles.
• birthRate: How many particles should generate at one time.
• birthDelay: Delays particle generation. This allows you to slowly release particles (like a gentle snow flurry) or send them out more quickly (like a blazing fire).
• blending: Some particle effects require blending, such as you did for your point lights in Chapter 14, “Deferred Rendering”.
Emitter creates a buffer the size of all the particles. emit() processes each new
particle and creates it with the particle settings you set up in the particle descriptor. More complex particle systems would maintain a live buffer and a dead buffer. As particles die, they move from live to dead, and as the system requires new particles, it recovers them from dead. However, in this more simple system, a particle never dies. As soon as a particle’s age reaches its life-span, it’s reborn with the values it started with.
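A rough sketch of that recycle step, with assumed names (the real update happens in the particle shader and Emitter code):
import simd

struct SimpleParticle {
  var position: SIMD2<Float>
  var startPosition: SIMD2<Float>
  var age: Float
  var life: Float
}

// Ages a particle by one frame; when it reaches its life-span it's reborn
// with the values it started with rather than being removed.
func age(_ particle: inout SimpleParticle) {
  particle.age += 1
  if particle.age >= particle.life {
    particle.age = 0
    particle.position = particle.startPosition
  }
}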
Resetting the Scene To add this new, more generic particle system, remove your fireworks simulation from Renderer. ➤ Open Renderer.swift and remove: var fireworks: Fireworks
➤ In init(metalView:options:), remove: fireworks = Fireworks()
➤ In draw(scene:in:), remove: // Render Fireworks with compute shaders fireworks.update(size: view.drawableSize) fireworks.draw(commandBuffer: commandBuffer, view: view)
➤ Build and run the app, and your screen will show the metal view’s initial clear color.
Reset project
Your game may require many different particle effects, so you’ll add them to the game scene.
➤ Open GameScene.swift, and add a new property to GameScene: var particleEffects: [Emitter] = []
Each particle effect will be an emitter, which you’ll add to this array.
Updating the Particle Structure
With a more complex particle system, you need to store more particle properties.
➤ Open Common.h, and add this to the end of Particle:
float age;
float size;
float scale;
float startScale;
float endScale;
vector_float2 startPosition;
You’ll now be able to decay the size of the particle over time. ➤ Open Emitter.swift and remove the Particle structure. Emitter will now use the structure in Common.h.
Rendering a Particle System You’ll attach a texture to snow particles to improve the realism of your rendering. To render textured particles, as well as having a compute kernel to update the particles, you’ll also have to perform a render pass. ➤ In the Render Passes group, open ParticlesRenderPass.swift. This is a skeleton render pass with the minimum requirements to conform to RenderPass. It’s already set up to render in Renderer. draw(commandBuffer:scene:uniforms:params:) first calls update(commandBuffer:scene:). This is where you’ll first create the compute
pipeline to update the particles. You’ll then build the pipeline for rendering the particles in render(commandBuffer:scene:).
➤ In ParticlesRenderPass, create the necessary pipeline state properties and replace init(view:):
let computePSO: MTLComputePipelineState
let renderPSO: MTLRenderPipelineState
let blendingPSO: MTLRenderPipelineState

init(view: MTKView) {
  computePSO = PipelineStates.createComputePSO(
    function: "computeParticles")
  renderPSO = PipelineStates.createParticleRenderPSO(
    pixelFormat: view.colorPixelFormat)
  blendingPSO = PipelineStates.createParticleRenderPSO(
    pixelFormat: view.colorPixelFormat,
    enableBlending: true)
}
computeParticles is the kernel function where you’ll update the particles every
frame. The two render pipeline states will use the vertex function vertex_particle to position a point on the screen, and fragment_particle to draw the sprite texture into the point.
➤ Add this code to update(commandBuffer:scene:):
// 1
guard let computeEncoder = commandBuffer.makeComputeCommandEncoder()
else { return }
computeEncoder.label = label
computeEncoder.setComputePipelineState(computePSO)
// 2
let threadsPerGroup = MTLSize(
  width: computePSO.threadExecutionWidth,
  height: 1,
  depth: 1)
// 3
for emitter in scene.particleEffects {
  emitter.emit()
  if emitter.currentParticles
In ParticleEffects.swift, createSnow(size:) describes the snow emitter:
static func createSnow(size: CGSize) -> Emitter {
  // 1
  var descriptor = ParticleDescriptor()
  descriptor.positionXRange = 0...Float(size.width)
  descriptor.direction = -.pi / 2
  descriptor.speedRange = 2...6
  descriptor.pointSizeRange = 80 * 0.5...80
  descriptor.startScale = 0
  descriptor.startScaleRange = 0.2...1.0
  // 2
  descriptor.life = 500
  descriptor.color = [1, 1, 1, 1]
  // 3
  return Emitter(
    descriptor,
    texture: "snowflake",
    particleCount: 100,
    birthRate: 1,
    birthDelay: 20)
}
Here’s what’s happening: 1. The descriptor describes how to initialize each particle. You set up ranges for position, speed and scale. Particles will appear at the top of the screen in random positions. 2. A particle has an age and a life-span. A snowflake particle will remain alive for 500 frames and then recycle. You want the snowflake to travel from the top of the screen all the way down to the bottom of the screen. life has to be long enough for this to happen. If you give your snowflake a short life, it will disappear while still on screen. 3. You tell the emitter how many particles in total should be in the system. birthRate and birthDelay control how fast the particles emit. With these parameters, you’ll emit one snowflake every twenty frames until there are 100 snowflakes in total. If you want a blizzard rather than a few flakes, then you can set the birthrate higher and the delay between each emission less. Particle parameters are really fun to experiment with. Once you have your snowflakes falling, change any of these parameters to see what the effect is. ➤ Open GameScene.swift, and add this to update(size: CGSize): let snow = ParticleEffects.createSnow(size: size) snow.position = [0, Float(size.height) + 100] particleEffects = [snow]
You set up the emitter off the top of the screen, so that particles don’t suddenly pop in. You then add it to the scene’s list of particles. Whenever the view resizes, you will reset the particle effects to fit in the new size.
➤ Build and run the app, and enjoy the relaxing snow with variable snowflake speeds and sizes:
A snow particle system
Go back and experiment with some of the particle settings. With particleCount of 800, birthDelay of 2 and speedRange of 4...8, you start off with a gentle snowfall that gradually turns into a veritable blizzard.
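If you want to try that blizzard, a sketch of the changed values inside createSnow(size:) might look like this — everything else in the method stays the same:
descriptor.speedRange = 4...8
return Emitter(
  descriptor,
  texture: "snowflake",
  particleCount: 800,
  birthRate: 1,
  birthDelay: 2)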
Fire
Brrr. That snow is so cold, you need a fire.
➤ Open GameScene.swift, and replace particleEffects = [snow] with:
let fire = ParticleEffects.createFire(size: size)
fire.position = [0, 0]
particleEffects = [snow, fire]
This positions the emitter just off the bottom of the screen.
➤ Open ParticleEffects.swift and examine the createFire(size:) settings, and see if you can work out what the particle system will look like. You’re loading more particles than for snow, and a different texture. The birth rate is higher, and there’s no delay. The direction is upwards, with a slight variation in range. The particle scales down over its life. The color is fiery orange with blending enabled. ➤ Build and run to see this new particle system in action.
Fire and snow
Key Points
• Particle emitters emit particles. These particles carry information about themselves, such as position, velocity and color.
• Particle attributes can vary over time. A particle may have a life and decay after a certain amount of time.
• As each particle in a particle system has the same attributes, the GPU is a good fit for updating them in parallel.
• Particle systems, depending on given attributes, can simulate physics or fluid systems, or even hair and grass systems.
Where to Go From Here?
You’ve only just begun playing with particles. There are many more particle characteristics you could include in your particle system:
• Color over life.
• Gravity.
• Acceleration.
• Instead of scaling linearly over time, how about scaling slowly then faster?
If you want more ideas, review the links in this chapter’s references.markdown. There are some things you haven’t yet looked at, like collisions or reaction to acting forces. You have also not read anything about intelligent agents and their behaviors. You’ll learn more about all this next, in Chapter 18, “Particle Behavior”.
18
Chapter 18: Particle Behavior
As you learned in the previous chapter, particles have been at the foundation of computer animation for years. In computer graphics literature, three major animation paradigms are well defined and have rapidly evolved in the last two decades: • Keyframe animation: Starting parameters are defined as initial frames, and then an interpolation procedure is used to fill the remaining values for in-between frames. You’ll cover this topic in Chapter 23, “Animation”. • Physically based animation: Starting values are defined as animation parameters, such as a particle’s initial position and velocity, but intermediate values are not specified externally. This topic was covered in Chapter 17, “Particle Systems”. • Behavioral animation: Starting values are defined as animation parameters. In addition, a cognitive process model describes and influences the way intermediate values are later determined. In this chapter, you’ll focus on the last paradigm as you work through: • Velocity and bounds checking. • Swarming behavior. • Behavioral animation. • Behavioral rules.
By the end of the chapter, you’ll build and control a swarm exhibiting basic behaviors you might see in nature.
Behavioral Animation You can broadly split behavioral animation into two major categories: • Cognitive behavior: This is the foundation of artificial life which differs from artificial intelligence in that AI objects do not exhibit behaviors or have their own preferences. It can range from a simple cause-and-effect based system to more complex systems, known as agents, that have a psychological profile influenced by the surrounding environment. • Aggregate behavior: Think of this as the overall outcome of a group of agents. This behavior is based on the individual rules of each agent and can influence the behavior of neighbors. In this chapter, you’ll keep your focus on aggregate behavior.
There’s a strict correlation between the various types of aggregate behavior entities and their characteristics. In the following table, notice how the presence of a physics system or intelligence varies between entity types.
• Particles are the largest aggregate entities and are mostly governed by the laws of physics, but they lack intelligence. • Flocks are an entity that’s well-balanced between size, physics and intelligence. • Crowds are smaller entities that are rarely driven by physics rules and are highly intelligent. Working with crowd animation is both a challenging and rewarding experience. However, the purpose of this chapter is to describe and implement a flocking-like system, or to be more precise, a swarm of insects.
Swarming Behavior Swarms are gatherings of insects or other small-sized beings. The swarming behavior of insects can be modeled in a similar fashion as the flocking behavior of birds, the herding behavior of animals or the shoaling behavior of fish.
You know from the previous chapter that particle systems are fuzzy objects whose dynamics are mostly governed by the laws of physics. There are no interactions between particles, and usually, they are unaware of their neighboring particles. In contrast, swarming behavior uses the concept of neighboring quite heavily. The swarming behavior follows a set of basic movement rules developed in 1986 by Craig Reynolds in an artificial flocking simulation program known as Boids. Since this chapter is heavily based on his work, the term boid will be used throughout the chapter instead of particle. Initially, this basic set only included three rules: cohesion, separation and alignment. Later, more rules were added to extend the set to include a new type of agent; one that has autonomous behavior and is characterized by the fact that it has more intelligence than the rest of the swarm. This led to defining new models such as follow-the-leader and predator-prey. Time to transform all of this knowledge into a swarm of quality code.
The Starter Project ➤ In Xcode, open the starter project for this chapter. There are only a few files in this project. • In the Flocking group, Emitter.swift creates a buffer containing the particles. But each particle has only a position attribute. • Renderer calls a FlockingPass on every frame and supplies the view’s current drawable texture as the GPU texture to update. • FlockingPass first clears the texture using the clearScreen compute shader from the previous project. It then dispatches threads for the number of particles. The dispatch code in FlockingPass.draw(in:commandBuffer:) contains an example of both macOS code and iOS code where non-uniform threads are not supported. • In the Shaders group, Flocking.metal has two kernel functions. One clears the drawable texture, and the other writes a pixel, representing a boid, to the given texture.
➤ Build and run the project, and you’ll see this:
The starter app There’s a problem: a visibility issue. In its current state, the boids are barely distinguishable despite being white on a black background. There’s a neat trick you can apply in cases like this when you don’t want to use a texture for boids (like you used in the previous chapter). In fact, scientific simulations and computational fluid dynamics projects very rarely use textures, if ever. You can’t use the [[point_size]] attribute here because you’re not rendering in the traditional sense. Instead, you’re writing pixels in a kernel function directly to the drawable’s texture. The trick is to “paint” the surrounding neighbors of each boid, which makes the current boid seem larger than it really is.
Painting the pixels around the boid
➤ In Flocking.metal, add this code at the end of the boids kernel function:
output.write(color, location + uint2(-1,  1));
output.write(color, location + uint2( 0,  1));
output.write(color, location + uint2( 1,  1));
output.write(color, location + uint2(-1,  0));
output.write(color, location + uint2( 1,  0));
output.write(color, location + uint2(-1, -1));
output.write(color, location + uint2( 0, -1));
output.write(color, location + uint2( 1, -1));
This code modifies the neighboring pixels around all sides of the boid which causes the boid to appear larger. ➤ Build and run the app, and you’ll see that the boids are more distinguishable now.
Larger boids
That’s a good start, but how do you get them to move around? For that, you need to look into velocity.
Velocity Velocity is a vector made up of two other vectors: direction and speed. The speed is the magnitude or length of the vector, and the direction is given by the linear equation of the line on which the vector lies.
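As a quick Swift illustration of that relationship (not part of the project):
import simd

let velocity = SIMD2<Float>(3, 4)
let speed = simd_length(velocity)        // 5 — the magnitude of the vector
let direction = simd_normalize(velocity) // the unit vector the boid moves along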
Properties of a vector ➤ In the Flocking group, open Emitter.swift, add a new member to the end of the Particle structure: var velocity: float2
➤ In init(particleCount:size:), inside the particle loop, add this before the last line where you advance the pointer: let velocity: float2 = [ Float.random(in: -5...5), Float.random(in: -5...5) ] pointer.pointee.velocity = velocity
This gives the particle (boid) a random direction and speed that ranges between -5 and 5. ➤ In the Shaders group, open Flocking.metal, and add velocity as a new member of the Boid structure: float2 velocity;
➤ In boids, add this code after the line where you define position: float2 velocity = boid.velocity; position += velocity; boid.position = position; boid.velocity = velocity; boids[id] = boid;
This code gets the current velocity, updates the current position with the velocity, and then updates the boid data before storing the new values. Build and run the app, and you’ll see that the boids are now moving everywhere on the screen and… uh, wait! It looks like they’re disappearing from the screen too. What happened? Although you set the velocity to random values, you still need a way to force the boids to stay on the screen. Essentially, you need a way to make the boids bounce back when they hit any of the edges.
Reflect and bounce at the edges For this function to work, you need to add checks for X and Y to make sure the boids stay in the rectangle defined by the origin and the size of the window, in other words, the width and height of your scene. ➤ In boids, add this code after float2 velocity = boid.velocity;: if (position.x < 0 || position.x > output.get_width()) { velocity.x *= -1; } if (position.y < 0 || position.y > output.get_height()) { velocity.y *= -1; }
Here, you check whether a boid coordinate gets outside the screen. If it does, you change the velocity sign, which changes the direction of the moving boid.
➤ Build and run the app, and you’ll see that the boids are now bouncing back when hitting an edge.
Bouncing boids
Currently, the boids only obey the laws of physics. They’ll travel to random locations with random velocities, and they’ll stay on the window screen because of a few strict physical rules you’re imposing on them. The next stage is to make the boids behave as if they are able to think for themselves.
Behavioral Rules There’s a basic set of steering rules that swarms and flocks can adhere to, and it includes: • Cohesion • Separation • Alignment • Escaping • Dampening You’ll learn about each of these rules as you implement them in your project.
Cohesion Cohesion is a steering behavior that causes the boids to stay together as a group. To determine how cohesion works, you need to find the average position of the group, known as the center of gravity. Each neighboring boid will then apply a steering force in the direction of this center and converge near the center.
Cohesion
➤ In Flocking.metal, at the top of the file, add three global constants:
constant float average = 100;
constant float attenuation = 0.1;
constant float cohesionWeight = 2.0;
With these constants, you defined:
• average: A value that represents a smaller group of the swarm that stays cohesive.
• attenuation: A toning down factor that lets you relax the cohesion rule.
• cohesionWeight: The contribution made to the final cumulative behavior.
➤ Create a new function for cohesion before boids:
float2 cohesion(
  uint index,
  device Boid* boids,
  uint particleCount)
{
  // 1
  Boid thisBoid = boids[index];
  float2 position = float2(0);
  // 2
  for (uint i = 0; i < particleCount; i++) {
    Boid boid = boids[i];
    if (i != index) {
      position += boid.position;
    }
  }
  // 3
  position /= (particleCount - 1);
  position = (position - thisBoid.position) / average;
  return position;
}
Going through the code: 1. Isolate the current boid at the given index from the rest of the group. Define and initialize position. 2. Loop through all of the boids in the swarm, and accumulate each boid’s position to the position variable. 3. Get an average position value for the entire swarm, and calculate another averaged position based on the current boid position and the fixed value average that preserves average locality. ➤ In boids, add this code immediately before position += velocity: float2 cohesionVector = cohesion(id, boids, particleCount) * attenuation; // velocity accumulation velocity += cohesionVector * cohesionWeight;
Here, you determine the cohesion vector for the current boid and then attenuate its force. You’ll build upon the velocity accumulation line as you go ahead with new behavioral rules. For now, you give cohesion a weight of 2 and add it to the total velocity.
➤ Build and run the app. Notice how the boids are initially trying to get away — following their random directions. Moments later, they’re pulled back toward the center of the flock.
Converging boids
Separation Separation is another steering behavior that allows a boid to stay a certain distance from nearby neighbors. This is accomplished by applying a repulsion force to the current boid when the set threshold for proximity is reached.
Separation
➤ Add two more global constants:
constant float limit = 20;
constant float separationWeight = 1.0;
Here’s what they’re for:
• limit: A value that represents the proximity threshold that triggers the repulsion force.
• separationWeight: The contribution made by the separation rule to the final cumulative behavior.
➤ Then, add the new separation function before boids:
float2 separation(
  uint index,
  device Boid* boids,
  uint particleCount)
{
  // 1
  Boid thisBoid = boids[index];
  float2 position = float2(0);
  // 2
  for (uint i = 0; i < particleCount; i++) {
    Boid boid = boids[i];
    if (i != index) {
      if (abs(distance(boid.position, thisBoid.position)) < limit) {
        position = position - (boid.position - thisBoid.position);
      }
    }
  }
  return position;
}
Going through the code: 1. Isolate the current boid at the given index from the rest of the group. Define and initialize position. 2. Loop through all of the boids in the swarm; if this is a boid other than the isolated one, check the distance between the current and isolated boids. If the distance is smaller than the proximity threshold, update the position to keep the isolated boid within a safe distance. ➤ In boids, before the // velocity accumulation comment, add this: float2 separationVector = separation(id, boids, particleCount) * attenuation;
➤ Then, update the velocity accumulation to include the separation contribution by adding this code immediately after the last: velocity += cohesionVector * cohesionWeight + separationVector * separationWeight;
➤ Build and run the project. Notice that now there’s a counter-effect of pushing back from cohesion as a result of the separation contribution.
Boid separation
Alignment Alignment is the last of the three steering behaviors Reynolds used for his flocking simulation. The main idea is to calculate an average of the velocities for a limited number of neighbors. The resulting average is often referred to as the desired velocity. With alignment, a steering force gets applied to the current boid’s velocity to make it align with the group.
Alignment
➤ To get this working, add two global constants: constant float neighbors = 8; constant float alignmentWeight = 3.0;
With these constants, you define:
• neighbors: A value that represents the size of the local group that determines the “desired velocity”.
• alignmentWeight: The contribution made by the alignment rule to the final cumulative behavior.
➤ Then, add the new alignment function before boids:
float2 alignment(
  uint index,
  device Boid* boids,
  uint particleCount)
{
  // 1
  Boid thisBoid = boids[index];
  float2 velocity = float2(0);
  // 2
  for (uint i = 0; i < particleCount; i++) {
    Boid boid = boids[i];
    if (i != index) {
      velocity += boid.velocity;
    }
  }
  // 3
  velocity /= (particleCount - 1);
  velocity = (velocity - thisBoid.velocity) / neighbors;
  return velocity;
}
Going through the code: 1. Isolate the current boid at the given index from the rest of the group. Define and initialize velocity. 2. Loop through all of the boids in the swarm, and accumulate each boid’s velocity to the velocity variable. 3. Get an average velocity value for the entire swarm, and then calculate another averaged velocity based on the current boid velocity and the size of the local group, neighbors, which preserves locality.
➤ In boids, before the // velocity accumulation comment, add this code: float2 alignmentVector = alignment(id, boids, particleCount) * attenuation;
➤ Then, add this to update the velocity accumulation to include the alignment contribution: velocity += cohesionVector * cohesionWeight + separationVector * separationWeight + alignmentVector * alignmentWeight;
➤ Build and run the app. The flock is homogeneous now because the alignment contribution brings balance to the previous two opposed contributions.
Boids aligning
Escaping Escaping is a new type of steering behavior that introduces an agent with autonomous behavior and slightly more intelligence — the predator.
In the predator-prey behavior, the predator tries to approach the closest prey on one side, while on the other side, the neighboring boids try to escape.
Escaping ➤ Like before, add new global constants to indicate the weight of the escaping force and the speed of reaction to the predator: constant float escapingWeight = 0.01; constant float predatorWeight = 10.0;
➤ Then, add the new escaping function before boids: float2 escaping(Boid predator, Boid boid) { return -predatorWeight * (predator.position - boid.position) / average; }
You return the averaged position of neighboring boids relative to the predator position. The final result is then adjusted and negated because the escaping direction is the opposite of where the predator is located. ➤ At the top of boids, replace: Boid boid = boids[id];
➤ With the following code: Boid predator = boids[0]; Boid boid; if (id != 0) { boid = boids[id]; }
Here, you isolate the first boid in the buffer and label it as the predator. For the rest of the boids, you create a new boid object. ➤ Toward the end of boids, after defining color, add this: if (id == 0) { color = half4(1.0, 0.0, 0.0, 1.0); location = uint2(boids[0].position); }
The predator will stand out by coloring it red. You also save its current position. ➤ Before the line where you define location, add this: // 1 if (predator.position.x < 0 || predator.position.x > output.get_width()) { predator.velocity.x *= -1; } if (predator.position.y < 0 || predator.position.y > output.get_height()) { predator.velocity.y *= -1; } // 2 predator.position += predator.velocity / 2.0; boids[0] = predator;
With this code, you: 1. Check for collisions with the edges of the screen, and change the velocity when that happens. 2. Update the predator position with the current velocity, attenuated to half value to slow it down. Finally, save the predator position and velocity to preserve them for later use. ➤ Before the // velocity accumulation comment, add this: float2 escapingVector = escaping(predator, boid) * attenuation;
➤ Then, add this to update the velocity accumulation to include the escaping contribution: velocity += cohesionVector * cohesionWeight + separationVector * separationWeight + alignmentVector * alignmentWeight + escapingVector * escapingWeight;
➤ Build and run the app. Notice that some of the boids are steering away from the group and avoiding the predator.
Escaping boids
Dampening
Dampening is the last steering behavior you’ll be looking at in this chapter. Its purpose is to dampen the effect of the escaping behavior, because at some point, the predator will stop its pursuit.
➤ Add one more global constant to represent the weight for the dampening:
constant float dampeningWeight = 1.0;
➤ Then, add the new dampening function before boids:
float2 dampening(Boid boid) {
  // 1
  float2 velocity = float2(0);
  // 2
  if (abs(boid.velocity.x) > limit) {
    velocity.x += boid.velocity.x / abs(boid.velocity.x) * attenuation;
  }
  if (abs(boid.velocity.y) > limit) {
    velocity.y = boid.velocity.y / abs(boid.velocity.y) * attenuation;
  }
  return velocity;
}
With this code, you: 1. Define and initialize the velocity variable. 2. Check if the velocity gets larger than the separation threshold. If it does, attenuate the velocity in the same direction. ➤ In boids, before the // velocity accumulation comment, add this: float2 dampeningVector = dampening(boid) * attenuation;
➤ Then, add this to update the velocity accumulation to include the dampening contribution: velocity += cohesionVector * cohesionWeight + separationVector * separationWeight + alignmentVector * alignmentWeight + escapingVector * escapingWeight + dampeningVector * dampeningWeight;
➤ Build and run the app. Notice the boids are staying together with the group again after the predator breaks pursuit.
Dampening keeps the boids together
Key Points
• You can give particles behavioral animation by causing them to react to other particles.
• Swarming behavior has been widely researched. The Boids simulation describes basic movement rules.
• The behavioral rules for boids include cohesion, separation and alignment.
• Adding a predator to the particle mass requires an escaping algorithm.
Where to Go From Here? In this chapter, you learned how to construct basic behaviors and apply them to a small flock. Continue developing your project by adding a colorful background and textures for the boids. Or make it a 3D flocking app by adding projection to the scene. When you’re done, add the flock animation to your engine. Whatever you do, the sky is the limit. This chapter barely scratched the surface of what is widely known as behavioral animation. Be sure to review the references.markdown file in the chapter directory for links to more resources about this wonderful topic.
Section III: Advanced Metal
In this section, you’ll learn many advanced features of Metal and explore realistic rendering techniques. You’ll animate characters, and also manage rendering your scenes on the GPU.
19
Chapter 19: Tessellation & Terrains
So far, you’ve used normal map trickery in the fragment function to show the fine details of your low poly models. To achieve a similar level of detail without using normal maps requires a change of model geometry by adding more vertices. The problem with adding more vertices is that when you send them to the GPU, it chokes up the pipeline. A hardware tessellator in the GPU can create vertices on the fly, adding a greater level of detail and thereby using fewer resources. In this chapter, you’ll create a detailed terrain using a small number of points. You’ll send a flat ground plane with a grayscale texture describing the height, and the tessellator will create as many vertices as needed. The vertex function will then read the texture and displace (move) these new vertices vertically.
Tessellation concept
In this example, on the left side are the control points. On the right side, the tessellator creates extra vertices, with the number dependent on how close the control points are to the camera.
Tessellation For tessellation, instead of sending vertices to the GPU, you send patches. These patches are made up of control points — a minimum of three for a triangle patch, or four for a quad patch. The tessellator can convert each quad patch into a certain number of triangles: up to 4,096 triangles on a recent iMac and 256 triangles on an iPhone that’s capable of tessellation. Note: Tessellation is available on all Macs since 2012 and on iOS 10 GPU Family 3 and up. This includes the iPhone 6s and newer devices. However, Tessellation is not available on the iOS simulator. With tessellation, you can: • Send less data to the GPU. Because the GPU doesn’t store tessellated vertices in graphics memory, it’s more efficient on resources. • Make low poly objects look less low poly by curving patches. • Displace vertices for fine detail instead of using normal maps to fake it. • Decide on the level of detail based on the distance from the camera. The closer an object is to the camera, the more vertices it contains.
The Starter Project So that you can more easily understand the difference between rendering patches and rendering vertices, the starter project is a simplified renderer. All the rendering code is in Renderer.swift, with the pipeline state setup in Pipelines.swift. Quad.swift contains the vertices and vertex buffer for the quad, and a method to generate control points. ➤ Open and run the starter project for this chapter.
The code in this project is the minimum needed for a simple render of six vertices to create a quad.
The starter app Your task in this chapter is to convert this quad to a terrain made up of patch quads with many vertices. Before creating a tessellated terrain, you’ll tessellate a single four-point patch. Instead of sending six vertices to the GPU, you’ll send the positions of the four corners of the patch. You’ll give the GPU edge factors and inside factors which tell the tessellator how many vertices to create. You’ll render in wireframe line mode so you can see the vertices added by the tessellator, but you can change this with the Wireframe toggle in the app. To convert the quad, you’ll do the following on the CPU side:
CPU pipeline On the GPU side, you’ll set up a tessellation kernel that processes the edge and inside factors. You’ll also set up a post-tessellation vertex shader that handles the vertices generated by the hardware tessellator.
GPU pipeline
Tessellation Patches A patch consists of a certain number of control points, generally: • bilinear: Four control points, one at each corner • biquadratic: Nine control points • bicubic: Sixteen control points
Tessellated patches
The control points make up a cage of spline curves. A spline is a parametric curve made up of control points. There are various algorithms to interpolate these control points, but here, A, B and C are the control points. As point P travels from A to B, point Q travels from B to C. The halfway point between P and Q describes the blue curve.
A bezier curve
To create the curved patch surface, the vertex function interpolates vertices to follow this parametric curve.
Note: Because the mathematics of curved surfaces is quite involved, you’ll work with only four control points per patch in this chapter.
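Here’s a small Swift sketch of that interpolation, assuming three control points and a parameter t between 0 and 1 — purely illustrative, since the hardware tessellator and vertex function do this work on the GPU:
import simd

// Quadratic interpolation: P travels from A to B, Q travels from B to C,
// and the point partway along PQ traces the curve.
func quadraticBezier(
  _ a: SIMD2<Float>,
  _ b: SIMD2<Float>,
  _ c: SIMD2<Float>,
  t: Float
) -> SIMD2<Float> {
  let p = a + (b - a) * t  // P on segment AB
  let q = b + (c - b) * t  // Q on segment BC
  return p + (q - p) * t   // point on the curve
}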
Tessellation Factors For each patch, you need to specify inside edge factors and outside edge factors. The four-point patch in the following image shows different edge factors for each edge — specified as [2, 4, 8, 16] — and two different inside factors — specified as [8, 16], for horizontal and vertical respectively.
Edge factors The edge factors specify how many segments an edge will be split into. An edge factor of 2 has two segments along the edge. For the inside factors, look at the horizontal and vertical center lines. In this example, the horizontal center has eight segments, and the vertical center has sixteen. Although only four control points (shown in red) went to the GPU, the hardware tessellator created a lot more vertices. However, creating more vertices on a flat plane doesn’t make the render any more interesting. Later, you’ll find out how to move these vertices around in the vertex function to make a bumpy terrain. But first, you’ll discover how to tessellate a single patch. ➤ In Renderer.swift, in Renderer, add the following code: let patches = (horizontal: 1, vertical: 1) var patchCount: Int { patches.horizontal * patches.vertical }
You create a constant for the number of patches you’re going to create, in this case, one. patchCount is a convenience property that returns the total number of patches. ➤ Next, add this: var edgeFactors: [Float] = [4] var insideFactors: [Float] = [4]
Here, you set up the edge and inside factors as Float array properties. These variables indicate four segments along each edge, and four in the middle. You can specify different factors for different edges by adding them to the array. For each patch, the GPU processes these edge factors and places the amount to tessellate each edge into a buffer.
➤ Create a property to provide a buffer of the correct length:
lazy var tessellationFactorsBuffer: MTLBuffer? = {
  // 1
  let count = patchCount * (4 + 2)
  // 2
  let size = count * MemoryLayout<Float>.size / 2
  return Renderer.device.makeBuffer(
    length: size,
    options: .storageModePrivate)
}()
1. count is the number of patches multiplied by the four edge factors and two inside factors.
2. Here you calculate the size of the buffer. In the tessellation kernel, you’ll fill the buffer with a special type consisting of half-floats, and each half-float takes half the size of a Float — hence the division by two.
Now it’s time to set up the patch data.
Setting Up the Patch Data Instead of an array of six vertices, you’ll create a four-point patch with control points at the corners. Currently, in Quad.swift, Quad holds a vertexBuffer property that contains the vertices. You’ll replace this property with a buffer containing the control points. ➤ In Renderer, add the following property: var controlPointsBuffer: MTLBuffer?
➤ At the end of init(metalView:options:), fill the buffer with control points:
let controlPoints = Quad.createControlPoints(
  patches: patches,
  size: (2, 2))
controlPointsBuffer = Renderer.device.makeBuffer(
  bytes: controlPoints,
  length: MemoryLayout<float3>.stride * controlPoints.count)
Quad.swift contains a method, createControlPoints(patches:size:). This method takes in the number of patches, and the unit size of the total number of patches. It then returns an array of xyz control points. Here, you create a patch with one corner at [-1, 0, 1], and the diagonal at [1, 0, -1]. This is a flat horizontal plane, but Renderer’s modelMatrix rotates the patch by 90º so you can see the patch vertices.
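For a sense of what that helper produces, here’s a hedged sketch of how four corner points per patch could be laid out on the XZ plane — the name, point order and centering are assumptions; the real implementation is in Quad.swift:
import simd

func makeControlPoints(
  patches: (horizontal: Int, vertical: Int),
  size: (width: Float, height: Float)
) -> [SIMD3<Float>] {
  var points: [SIMD3<Float>] = []
  let patchWidth = size.width / Float(patches.horizontal)
  let patchDepth = size.height / Float(patches.vertical)
  // Center the whole grid of patches on the origin.
  let startX = -size.width / 2
  let startZ = size.height / 2
  for row in 0..<patches.vertical {
    for column in 0..<patches.horizontal {
      let left = startX + Float(column) * patchWidth
      let right = left + patchWidth
      let near = startZ - Float(row) * patchDepth
      let far = near - patchDepth
      // Four corners per patch, on a flat horizontal plane (y = 0).
      points.append([left, 0, far])
      points.append([right, 0, far])
      points.append([right, 0, near])
      points.append([left, 0, near])
    }
  }
  return points
}
With patches of (1, 1) and a size of (2, 2), this yields corners including [-1, 0, 1] and [1, 0, -1], matching the patch described above.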
Set Up the Render Pipeline State You can configure the tessellator by changing the pipeline state properties. Until now, you’ve processed only vertices with the vertex descriptor. However, you’ll now modify the vertex descriptor so it processes patches instead. ➤ Open Pipelines.swift, and in createRenderPSO(colorPixelFormat:), where you set up vertexDescriptor, add this: vertexDescriptor.layouts[0].stepFunction = .perPatchControlPoint
With the old setup, you were using a default stepFunction of .perVertex. With that setup, the vertex function fetches new attribute data every time a new vertex is processed. Now that you’ve moved on to processing patches, you need to fetch new attribute data for every control point.
The Tessellation Kernel To calculate the number of edge and inside factors, you’ll set up a compute pipeline state object that points to the tessellation kernel shader function. ➤ Open Renderer.swift, and add a new property to Renderer: var tessellationPipelineState: MTLComputePipelineState
➤ In init(metalView:options:), before super.init(), add this: tessellationPipelineState = PipelineStates.createComputePSO(function: "tessellation_main")
Here, you instantiate the pipeline state for the compute pipeline.
Compute Pass
You now have a compute pipeline state and an MTLBuffer containing the patch data. You also created an empty buffer which the tessellation kernel will fill with the edge and inside factors. Next, you need to create the compute command encoder to dispatch the tessellation kernel.
➤ In tessellation(commandBuffer:), add the following:
guard let computeEncoder =
  commandBuffer.makeComputeCommandEncoder() else { return }
computeEncoder.setComputePipelineState(
  tessellationPipelineState)
computeEncoder.setBytes(
  &edgeFactors,
  length: MemoryLayout<Float>.size * edgeFactors.count,
  index: 0)
computeEncoder.setBytes(
  &insideFactors,
  length: MemoryLayout<Float>.size * insideFactors.count,
  index: 1)
computeEncoder.setBuffer(
  tessellationFactorsBuffer,
  offset: 0,
  index: 2)
draw(in:) calls tessellation(commandBuffer:) before any rendering. You create
a compute command encoder and bind the edge and inside factors to the compute function (the tessellation kernel). If you have multiple patches, the compute function will operate in parallel on each patch, on different threads. ➤ To tell the GPU how many threads you need, continue by adding this after the previous code: let width = min( patchCount, tessellationPipelineState.threadExecutionWidth) let gridSize = MTLSize(width: patchCount, height: 1, depth: 1) let threadsPerThreadgroup = MTLSize(width: width, height: 1, depth: 1) computeEncoder.dispatchThreadgroups( gridSize, threadsPerThreadgroup: threadsPerThreadgroup) computeEncoder.endEncoding()
The compute grid is one dimensional, with a thread for each patch.
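Note: This chapter dispatches explicit threadgroups, which works on every device. On GPUs that support non-uniform threadgroup sizes (recent iOS devices and Macs), you could instead let Metal split the grid for you. This is an optional alternative, not what the chapter's code uses:

let threadsPerGrid = MTLSize(width: patchCount, height: 1, depth: 1)
let threadsPerThreadgroup = MTLSize(
  width: min(patchCount, tessellationPipelineState.threadExecutionWidth),
  height: 1,
  depth: 1)
// Metal handles any remainder threads for you.
computeEncoder.dispatchThreads(
  threadsPerGrid,
  threadsPerThreadgroup: threadsPerThreadgroup)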
Before changing the render encoder so it’ll draw patches instead of vertices, you’ll need to create the tessellation kernel.
The Tessellation Kernel Function
➤ Create a new Metal file named Tessellation.metal, and add this:

#import "Common.h"

kernel void tessellation_main(
  constant float *edge_factors [[buffer(0)]],
  constant float *inside_factors [[buffer(1)]],
  device MTLQuadTessellationFactorsHalf *factors [[buffer(2)]],
  uint pid [[thread_position_in_grid]])
{
}

kernel specifies the type of shader. The function operates on all threads (i.e., all patches) and receives the three things you sent over: the edge factors, inside factors and the empty tessellation factors buffer that you're going to fill in this function. The fourth parameter is the patch ID with its thread position in the grid. The tessellation factors buffer consists of an array of edge and inside factors for each patch, and pid gives you the patch index into this array.
➤ Inside the kernel function, add the following:

factors[pid].edgeTessellationFactor[0] = edge_factors[0];
factors[pid].edgeTessellationFactor[1] = edge_factors[0];
factors[pid].edgeTessellationFactor[2] = edge_factors[0];
factors[pid].edgeTessellationFactor[3] = edge_factors[0];
factors[pid].insideTessellationFactor[0] = inside_factors[0];
factors[pid].insideTessellationFactor[1] = inside_factors[0];
This code fills in the tessellation factors buffer with the edge factors that you sent over. The edge and inside factors array you sent over only had one value each, so you put this value into all factors. Filling out a buffer with values is a trivial thing for a kernel to do, and you could do this on the CPU. However, as you get more patches and more complexity on how to tessellate these patches, you'll understand why sending the data to the GPU for parallel processing is a useful step. After the compute pass is done, the render pass takes over.
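Putting the two snippets together, the complete kernel at this stage is short. For reference, it reads:

#import "Common.h"

// One thread runs per patch and copies the constant factors into the
// tessellation factors buffer.
kernel void tessellation_main(
  constant float *edge_factors [[buffer(0)]],
  constant float *inside_factors [[buffer(1)]],
  device MTLQuadTessellationFactorsHalf *factors [[buffer(2)]],
  uint pid [[thread_position_in_grid]])
{
  factors[pid].edgeTessellationFactor[0] = edge_factors[0];
  factors[pid].edgeTessellationFactor[1] = edge_factors[0];
  factors[pid].edgeTessellationFactor[2] = edge_factors[0];
  factors[pid].edgeTessellationFactor[3] = edge_factors[0];
  factors[pid].insideTessellationFactor[0] = inside_factors[0];
  factors[pid].insideTessellationFactor[1] = inside_factors[0];
}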
The Render Pass Before doing the render, you need to tell the render encoder about the tessellation factors buffer that you updated during the compute pass. ➤ Open Renderer.swift, and in render(commandBuffer:view:), locate the // draw comment. Just after that comment, add this: renderEncoder.setTessellationFactorBuffer( tessellationFactorsBuffer, offset: 0, instanceStride: 0)
The post-tessellation vertex function reads from this buffer that you set up during the kernel function. Instead of drawing triangles from the vertex buffer, you’ll draw the patch using patch control points from the control points buffer. ➤ Replace: renderEncoder.setVertexBuffer( quad.vertexBuffer, offset: 0, index: 0)
With: renderEncoder.setVertexBuffer( controlPointsBuffer, offset: 0, index: 0)
➤ Replace the drawPrimitives command with this: renderEncoder.drawPatches( numberOfPatchControlPoints: 4, patchStart: 0, patchCount: patchCount, patchIndexBuffer: nil, patchIndexBufferOffset: 0, instanceCount: 1, baseInstance: 0)
The render command encoder tells the GPU that it’s going to draw one patch with four control points.
The Post-Tessellation Vertex Function
➤ Open Shaders.metal. The GPU calls the vertex function after the tessellator has done its job of creating the vertices. The function will operate on each one of these new vertices. In the vertex function, you'll tell each vertex what its position in the rendered quad should be.
➤ Rename VertexIn to ControlPoint. The definition of position remains the same.
Because you used a vertex descriptor to describe the incoming control point data, you can use the [[stage_in]] attribute. The vertex function will check the vertex descriptor from the current pipeline state, find that the data is in buffer 0 and use the vertex descriptor layout to read in the data.
➤ Replace vertex_main with:

// 1
[[patch(quad, 4)]]
// 2
vertex VertexOut vertex_main(
  // 3
  patch_control_point<ControlPoint> control_points [[stage_in]],
  // 4
  constant Uniforms &uniforms [[buffer(BufferIndexUniforms)]],
  // 5
  float2 patch_coord [[position_in_patch]])
{
}
This is the post-tessellation vertex function where you return the correct position of the vertex for the rasterizer. Going through the code: 1. The function qualifier tells the vertex function that the vertices are coming from tessellated patches. It describes the type of patch, triangle or quad, and the number of control points, in this case, four. 2. The function is still a vertex shader function as before. 3. patch_control_point is part of the Metal Standard Library and provides the per-patch control point data. raywenderlich.com
4. Uniforms contains the model-view-projection matrix you passed in. 5. The tessellator provides a uv coordinate between 0 and 1 for the tessellated patch so that the vertex function can calculate its correct rendered position. To visualize how this works, you can temporarily return the UV coordinates as the position and color. ➤ Add the following code to vertex_main: float u = patch_coord.x; float v = patch_coord.y; VertexOut out; out.position = float4(u, v, 0, 1); out.color = float4(u, v, 0, 1); return out;
Here, you give the vertex a position as interpolated by the tessellator from the four patch positions, and a color of the same value for visualization. ➤ Build and run the app. See how the patch is tessellated with vertices between 0 and 1? (Normalized Device Coordinates (NDC) are between -1 and 1 which is why all the coordinates are at the top right.)
Basic tessellation
To have your vertex positions depend on the patch’s actual position rather than between 0 and 1, you need to interpolate the patch’s control points depending on the UV values. ➤ In vertex_main, after assigning the u and v values, add this: float2 top = mix( control_points[0].position.xz, control_points[1].position.xz, u); float2 bottom = mix( control_points[3].position.xz, control_points[2].position.xz, u);
You interpolate values horizontally along the top of the patch and the bottom of the patch. Notice the index ordering: the patch indices are 0 to 3 clockwise.
Control point winding order You can change this by setting the pipeline descriptor property tessellationOutputWindingOrder to .counterClockwise. ➤ Change the following code: out.position = float4(u, v, 0, 1);
➤ To: float2 interpolated = mix(top, bottom, v); float4 position = float4( interpolated.x, 0.0, interpolated.y, 1.0); out.position = uniforms.mvp * position;
You interpolate the vertical value between the top and bottom values and multiply it by the model-view-projection matrix to position the vertex in the scene. Currently, you’re leaving y at 0.0 to keep the patch two-dimensional. ➤ Build and run the app to see your tessellated patch:
A tessellated patch Note: Experiment with changing the edge and inside factors until you’re comfortable with how the tessellator subdivides. For example, change the edge factors array to [2, 4, 8, 16], and change the kernel function so that the appropriate array value goes into each edge.
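To make the interpolation concrete, here's a worked example for the single patch you set up earlier, whose corners sit at x and z values of -1 and 1. The ordering below assumes control point 0 is at [-1, 0, 1] and the points run clockwise so that point 1 = [1, 0, 1], point 2 = [1, 0, -1] and point 3 = [-1, 0, -1]; check the winding-order figure for your exact setup.

// For a tessellated vertex at (u, v) = (0.25, 0.5):
//   top    = mix((-1,  1), (1,  1), 0.25) = (-0.5,  1)
//   bottom = mix((-1, -1), (1, -1), 0.25) = (-0.5, -1)
//   interpolated = mix(top, bottom, 0.5)  = (-0.5,  0)
// So this vertex lands at x = -0.5, z = 0 on the patch, before the mvp transform.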
Multiple Patches Now that you know how to tessellate one patch, you can tile the patches and choose edge factors that depend on dynamic factors, such as distance. ➤ Open Renderer.swift, and change the patches initialization to: let patches = (horizontal: 2, vertical: 2)
➤ Build and run the app to see the four patches joined together:
Four tessellated patches In the vertex function, you can identify which patch you’re currently processing using the [[patch_id]] attribute. ➤ Open Shaders.metal, and add this parameter to vertex_main: uint patchID [[patch_id]]
➤ Change the assignment of out.color to: out.color = float4(0); if (patchID == 0) { out.color = float4(1, 0, 0, 1); }
➤ Build and run the app.
Colored by patch id
Notice how the GPU colors the bottom left patch red. This is the first patch in the control points array.
Tessellation By Distance In this section, you’re going to create a terrain with patches that are tessellated according to the distance from the camera. When you’re close to a mountain, you need to see more detail; when you’re farther away, less. Having the ability to dial in the level of detail is where tessellation comes into its own. By setting the level of detail, you save on how many vertices the GPU has to process in any given situation. ➤ Open Common.h, and add the following code: typedef struct { vector_float2 size; float height; uint maxTessellation; } Terrain;
You set up a new structure to describe the size and maximum tessellation of the terrain. You’ll use height for scaling vertices on the y-axis later. ➤ Open Renderer.swift, and add a new constant to Renderer: static let maxTessellation = 16
This value is the maximum amount you can tessellate per patch. On iOS devices, currently the maximum amount is 16, but on new Macs, the maximum is 64. ➤ Add a new property: var terrain = Terrain( size: [2, 2], height: 1, maxTessellation: UInt32(Renderer.maxTessellation))
You describe the terrain with four patches and a maximum height of one unit.
➤ Locate where you set up controlPoints in init(metalView:), and change it to: let controlPoints = Quad.createControlPoints( patches: patches, size: (width: terrain.size.x, height: terrain.size.y))
Because your terrains are going to be much larger, you'll use the terrain constant to create the control points. To calculate edge factors that are dependent on the distance from the camera, you will send the camera position, model matrix and control points to the kernel.
➤ In tessellation(commandBuffer:), before you set the width of the compute threads with let width = min(patchCount..., add this:

var cameraPosition = float4(camera.position, 0)
computeEncoder.setBytes(
  &cameraPosition,
  length: MemoryLayout<float4>.stride,
  index: 3)
var matrix = modelMatrix
computeEncoder.setBytes(
  &matrix,
  length: MemoryLayout<float4x4>.stride,
  index: 4)
computeEncoder.setBuffer(
  controlPointsBuffer,
  offset: 0,
  index: 5)
computeEncoder.setBytes(
  &terrain,
  length: MemoryLayout<Terrain>.stride,
  index: 6)
You send the camera position, along with the model matrix, the control points buffer and the terrain information to the tessellation kernel.
➤ Open Tessellation.metal, and add these parameters to tessellation_main:

constant float4 &camera_position [[buffer(3)]],
constant float4x4 &modelMatrix [[buffer(4)]],
constant float3 *control_points [[buffer(5)]],
constant Terrain &terrain [[buffer(6)]],
With these constants, you can compute the distance of the edges from the camera. You’ll set the edge and inside tessellation factors differently for each patch edge instead of sending a constant 4 for all of the edges. The further the patch edge is from the camera, the lower the tessellation on that edge. These are the edge and control point orders for each patch:
Edges and control points
To calculate the tessellation of an edge, you need to know the transformed mid-point of two control points. To calculate edge 2, for example, you get the midpoint of points 1 and 2 and find out the distance of that point from the camera. Where two patches join, it's imperative to keep the tessellation level for the joined edges the same, otherwise you get cracks. By calculating the distance of the mid-point, you end up with the same result for the overlapping edges.
➤ In Tessellation.metal, create a new function before tessellation_main:

float calc_distance(
  float3 pointA,
  float3 pointB,
  float3 camera_position,
  float4x4 modelMatrix)
{
  float3 positionA = (modelMatrix * float4(pointA, 1)).xyz;
  float3 positionB = (modelMatrix * float4(pointB, 1)).xyz;
  float3 midpoint = (positionA + positionB) * 0.5;
  float camera_distance = distance(camera_position, midpoint);
  return camera_distance;
}
This function takes in two points: The camera position and the model matrix. The function then finds the mid-point between the two points and calculates the distance from the camera. ➤ Remove all of the code from tessellation_main. ➤ Add the following line of code to tessellation_main to calculate the correct index into the tessellation factors array: uint index = pid * 4; 4 is the number of control points per patch, and pid is the patch ID. To index into
the control points array for each patch, you skip over four control points at a time. ➤ Add this line to keep a running total of tessellation factors: float totalTessellation = 0;
➤ Add a for loop for each of the edges:

for (int i = 0; i < 4; i++) {
  int pointAIndex = i;
  int pointBIndex = i + 1;
  if (pointAIndex == 3) {
    pointBIndex = 0;
  }
  int edgeIndex = pointBIndex;
}
You cycle around four corners: 0, 1, 2, 3. On the first iteration, you calculate edge 1 from the mid-point of points 0 and 1. On the fourth iteration, you use points 3 and 0 to calculate edge 0. ➤ At the end of the for loop, call the distance calculation function: float cameraDistance = calc_distance( control_points[pointAIndex + index], control_points[pointBIndex + index], camera_position.xyz, modelMatrix);
➤ Then, still inside the for loop, set the tessellation factor for the current edge: float tessellation = max(4.0, terrain.maxTessellation / cameraDistance); factors[pid].edgeTessellationFactor[edgeIndex] = tessellation; totalTessellation += tessellation;
You set a minimum edge factor of 4. The maximum depends upon the camera distance and the maximum tessellation amount you specified for the terrain. ➤ After the for loop, add this: factors[pid].insideTessellationFactor[0] = totalTessellation * 0.25; factors[pid].insideTessellationFactor[1] = totalTessellation * 0.25;
You set the two inside tessellation factors to be an average of the total tessellation for the patch. You’ve now finished creating the compute kernel which calculates the edge factors based on distance from the camera. Lastly, you’ll revise some of the default render pipeline state tessellation parameters. ➤ Open Pipelines.swift, and in createRenderPSO(colorPixelFormat:), add this before the return: // 1 pipelineDescriptor.tessellationFactorStepFunction = .perPatch // 2 pipelineDescriptor.maxTessellationFactor = Renderer.maxTessellation // 3 pipelineDescriptor.tessellationPartitionMode = .fractionalEven
1. The step function was previously set to a default .constant, which sets the same edge factors on all patches. By setting this to .perPatch, the vertex function uses each patch’s edge and inside factors information in the tessellation factors array. 2. You set the maximum number of segments per patch for the tessellator. 3. The partition mode describes how these segments are split up. The default is .pow2, which rounds up to the nearest power of two. Using .fractionalEven, the tessellator rounds up to the nearest even integer, so it allows for much more variation of tessellation. ➤ Build and run the app, and rotate and zoom your patches.
As you reposition the patches, the tessellator recalculates their distance from the camera and tessellates accordingly. Tessellating is a neat superpower!
Tessellation by distance Check where the patches join. The triangles of each side of the patch should connect. Now that you’ve mastered tessellation, you’ll be able to add detail to your terrain.
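For reference, here's the assembled tessellation_main at this point, pieced together from the snippets above. Your file should look very similar:

kernel void tessellation_main(
  constant float *edge_factors [[buffer(0)]],
  constant float *inside_factors [[buffer(1)]],
  device MTLQuadTessellationFactorsHalf *factors [[buffer(2)]],
  constant float4 &camera_position [[buffer(3)]],
  constant float4x4 &modelMatrix [[buffer(4)]],
  constant float3 *control_points [[buffer(5)]],
  constant Terrain &terrain [[buffer(6)]],
  uint pid [[thread_position_in_grid]])
{
  uint index = pid * 4;
  float totalTessellation = 0;
  for (int i = 0; i < 4; i++) {
    int pointAIndex = i;
    int pointBIndex = i + 1;
    if (pointAIndex == 3) {
      pointBIndex = 0;
    }
    int edgeIndex = pointBIndex;
    // tessellate each edge according to its distance from the camera
    float cameraDistance = calc_distance(
      control_points[pointAIndex + index],
      control_points[pointBIndex + index],
      camera_position.xyz,
      modelMatrix);
    float tessellation =
      max(4.0, terrain.maxTessellation / cameraDistance);
    factors[pid].edgeTessellationFactor[edgeIndex] = tessellation;
    totalTessellation += tessellation;
  }
  factors[pid].insideTessellationFactor[0] = totalTessellation * 0.25;
  factors[pid].insideTessellationFactor[1] = totalTessellation * 0.25;
}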
Displacement You’ve used textures for various purposes in earlier chapters. Now you’ll use a height map to change the height of each vertex. Height maps are grayscale images where you can use the texel value for the Y vertex position, with white being high and black being low. There are several height maps in Textures.xcassets you can experiment with. ➤ Open Renderer.swift, and create a property to hold the height map: let heightMap: MTLTexture!
➤ In init(metalView:), before calling super.init(), initialize heightMap: do { heightMap = try TextureController.loadTexture(filename: "mountain") } catch { fatalError(error.localizedDescription) }
Here, you load the height map texture from the asset catalog.
➤ In render(commandBuffer:view:), add the following code before the draw call renderEncoder.drawPatches(...):

renderEncoder.setVertexTexture(heightMap, index: 0)
renderEncoder.setVertexBytes(
  &terrain,
  length: MemoryLayout<Terrain>.stride,
  index: 6)

You're already familiar with sending textures to the fragment shader; sending a texture to the vertex shader works the same way. You also send the terrain setup details.
➤ Open Shaders.metal, and add the following to vertex_main's parameters:

texture2d<float> heightMap [[texture(0)]],
constant Terrain &terrain [[buffer(6)]],
You read in the texture and terrain information. You’re currently only using the x and z position coordinates for the patch and leaving the y coordinate as zero. You’ll now map the y coordinate to the height indicated in the texture. Just as you used u and v fragment values to read the appropriate texel in the fragment function, you use the x and z position coordinates to read the texel from the height map in the vertex function. ➤ After setting position, but before multiplying by the model-view-projection matrix, add this: // 1 float2 xy = (position.xz + terrain.size / 2.0) / terrain.size; // 2 constexpr sampler sample; float4 color = heightMap.sample(sample, xy); out.color = float4(color.r); // 3 float height = (color.r * 2 - 1) * terrain.height; position.y = height;
Going through the code: 1. You convert the patch control point values to be between 0 and 1 to be able to sample the height map. You include the terrain size because, although your patch control points are currently between -1 and 1, soon you’ll be making a larger terrain. 2. Create a default sampler and read the texture as you have done previously in the fragment function. The texture is a grayscale texture, so you only use the .r value. 3. color is between 0 and 1, so for the height, shift the value to be between -1 and 1, and multiply it by your terrain height scale setting. This is currently set to 1. ➤ Next, remove the following code from the end of the vertex function, because you’re now using the color of the height map. out.color = float4(0); if (patchID == 0) { out.color = float4(1, 0, 0, 1); }
➤ Open Renderer.swift, and change the rotation property initialization in modelMatrix to: let rotation = float3(Float(-20).degreesToRadians, 0, 0)
You’ll now look at your tessellated plane from the side, rather than from the top. ➤ Build and run the app to see the height map displacing the vertices. Notice how the white vertices are high and the black ones are low:
Height map displacement
This render doesn’t yet have much detail, but that’s about to change. ➤ In Renderer.swift, change the maxTessellation constant to: static let maxTessellation: Int = { #if os(macOS) return 64 #else return 16 #endif }()
These are the maximum values for each OS. Because the maximum tessellation on iOS is so low, you may want to increase the number of patches rendered on iOS. ➤ Change patches and terrain to: let patches = (horizontal: 6, vertical: 6) var terrain = Terrain( size: [8, 8], height: 1, maxTessellation: UInt32(Renderer.maxTessellation))
This time, you’re creating thirty-six patches over sixteen units. ➤ Build and run the app to see your patch height-mapped into a magnificent mountain. Don’t forget to click off the wireframe option to see your mountain render in its full glory.
A tessellated mountain Now it’s time to render your mountain with different colors and textures depending on height.
Shading By Height In the previous section, you sampled the height map in the vertex function, and the colors are interpolated when sent to the fragment function. For maximum color detail, you need to sample from textures per fragment, not per vertex. For that to work, you’ll set up three textures: snow, cliff and grass. You’ll send these textures to the fragment function and test the height there. ➤ Open Renderer.swift, and add three new texture properties to Renderer: let cliffTexture: MTLTexture? let snowTexture: MTLTexture? let grassTexture: MTLTexture?
➤ In init(metalView:), in the do closure where you create the height map, add this: cliffTexture = try TextureController.loadTexture(filename: "cliff-color") snowTexture = try TextureController.loadTexture(filename: "snow-color") grassTexture = try TextureController.loadTexture(filename: "grass-color")
These textures are in the asset catalog. ➤ To send the textures to the fragment function, in render(commandBuffer:view:), add the following code before the renderEncoder draw call: renderEncoder.setFragmentTexture(cliffTexture, index: 1) renderEncoder.setFragmentTexture(snowTexture, index: 2) renderEncoder.setFragmentTexture(grassTexture, index: 3)
➤ Open Shaders.metal, and add two new properties to VertexOut: float height; float2 uv;
➤ At the end of vertex_main, before the return, set the value of these two properties: out.uv = xy; out.height = height;
You send the height value from the vertex function to the fragment function so that you can assign fragments the correct texture for that height.
➤ Add the textures to the fragment_main parameters:

texture2d<float> cliffTexture [[texture(1)]],
texture2d<float> snowTexture [[texture(2)]],
texture2d<float> grassTexture [[texture(3)]]
➤ Replace the contents of fragment_main with: constexpr sampler sample(filter::linear, address::repeat); float tiling = 16.0; float4 color; if (in.height < -0.5) { color = grassTexture.sample(sample, in.uv * tiling); } else if (in.height < 0.3) { color = cliffTexture.sample(sample, in.uv * tiling); } else { color = snowTexture.sample(sample, in.uv * tiling); } return color;
You create a tileable texture sampler and read in the appropriate texture for the height. Height is between -1 and 1, as set in vertex_main. You then tile the texture by 16 — an arbitrary value based on what looks best here. ➤ Build and run the app. Click the wireframe toggle to see your textured mountain. You have grass at low altitudes and snowy peaks at high altitudes.
A textured mountain As you zoom and rotate, notice how the mountain seems to ripple. This is the tessellation level of detail being over-sensitive. One way of dialing this down is to change the render pass’s tessellation partition mode.
➤ Open Pipelines.swift, and in createRenderPSO(colorPixelFormat:), change the pipelineDescriptor.tessellationPartitionMode assignment to: pipelineDescriptor.tessellationPartitionMode = .pow2
➤ Build and run the app. As the tessellator rounds up the edge factors to a power of two, there’s a larger difference in tessellation between the patches now, but the change in tessellation won’t occur so frequently, and the ripple disappears.
Rounding edge factors to a power of two
Shading By Slope The snow line in your previous render is unrealistic. By checking the slope of the mountain, you can show the snow texture in flatter areas, and show the cliff texture where the slope is steep. An easy way to calculate slope is to run a Sobel filter on the height map. A Sobel filter is an algorithm that looks at the gradients between neighboring pixels in an image. It’s useful for edge detection in computer vision and image processing, but in this case, you can use the gradient to determine the slope between neighboring pixels.
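Note: If you're curious what the filter does under the hood, these are the classic Sobel convolution kernels for the horizontal and vertical gradients:

Gx = [ -1  0  1 ]      Gy = [ -1 -2 -1 ]
     [ -2  0  2 ]           [  0  0  0 ]
     [ -1  0  1 ]           [  1  2  1 ]

MPSImageSobel combines the two gradients into a single edge-strength value per pixel, which is exactly the slope measure you need here.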
Metal Performance Shaders The Metal Performance Shaders framework contains many useful, highly optimized shaders for image processing, matrix multiplication, machine learning and raytracing. You’ll read more about them in Chapter 30, “Metal Performance Shaders.” The shader you’ll use here is MPSImageSobel, which takes a source image texture and outputs the filtered image into a new grayscale texture. The whiter the pixel, the steeper the slope. raywenderlich.com
Note: In the challenge for this chapter, you’ll use the Sobel-filtered image and apply the three textures to your mountain depending on slope. ➤ Open Renderer.swift, and import the Metal Performance Shaders framework: import MetalPerformanceShaders
➤ Create a new method in Renderer to process the height map: static func heightToSlope(source: MTLTexture) -> MTLTexture { }
Next, you’ll send the height map to this method and return a new texture. To create the new texture, you first need to create a texture descriptor where you can assign the size, pixel format and tell the GPU how you will use the texture. ➤ Add this to heightToSlope(source:): let descriptor = MTLTextureDescriptor.texture2DDescriptor( pixelFormat: source.pixelFormat, width: source.width, height: source.height, mipmapped: false) descriptor.usage = [.shaderWrite, .shaderRead]
You create a descriptor for textures that you want to both read and write. You’ll write to the texture in the MPS shader and read it in the fragment shader. ➤ Continue adding to the method: guard let destination = Renderer.device.makeTexture(descriptor: descriptor), let commandBuffer = Renderer.commandQueue.makeCommandBuffer() else { fatalError("Error creating Sobel texture") }
This creates the texture and the command buffer for the MPS shader.
➤ Now, add this:

let shader = MPSImageSobel(device: Renderer.device)
shader.encode(
  commandBuffer: commandBuffer,
  sourceTexture: source,
  destinationTexture: destination)
commandBuffer.commit()
return destination
You run the MPS shader and return the texture. That’s all there is to running a Metal Performance Shader on a texture. Note: The height maps in the asset catalog have a pixel format of 8 Bit Normalized - R, or R8Unorm. Using the default pixel format of RGBA8Unorm with MPSImageSobel crashes. In any case, for grayscale texture maps that only use one channel, using R8Unorm as a pixel format is more efficient. ➤ To hold the terrain slope in a texture, add a new property to Renderer: let terrainSlope: MTLTexture
➤ In init(metalView:), before calling super.init(), initialize the texture: terrainSlope = Renderer.heightToSlope(source: heightMap)
The texture when created will look like this:
The Sobel filter In the challenge, once you send this texture to the vertex shader, you’ll be able to see it using the Capture GPU Frame icon. The white parts are the steep slopes.
Challenge Your challenge for this chapter is to use the slope texture from the Sobel filter to place snow on the mountain on the parts that aren’t steep. Because you don’t need pixel perfect accuracy, you can read the slope image in the vertex function and send that value to the fragment function. This is more efficient as there will be fewer texture reads in the vertex function than in the fragment function. If everything goes well, you’ll render an image like this:
Shading by slope Notice how the grass blends into the mountain. This is done using the mix() function. Currently, you have three zones — the heights where you render the three different textures. The challenge project has four zones: • grass: < -0.6 in height • grass blended with mountain: -0.6 to -0.4 • mountain: -0.4 to -0.2 • mountain with snow on the flat parts: > -0.2 See if you can get your mountain to look like the challenge project in the projects directory for this chapter.
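If you'd like a nudge before opening the challenge project, here's one possible shape for the height and slope test in the fragment function. This is only a sketch: in.slope is an assumed VertexOut property carrying the sampled slope value, and the 0.5 slope threshold is an arbitrary choice, so expect the challenge project to differ in detail:

float4 grass = grassTexture.sample(sample, in.uv * tiling);
float4 cliff = cliffTexture.sample(sample, in.uv * tiling);
float4 snow = snowTexture.sample(sample, in.uv * tiling);
float4 color;
if (in.height < -0.6) {
  color = grass;
} else if (in.height < -0.4) {
  // blend grass into the mountain across the transition zone
  float blend = (in.height + 0.6) / 0.2;
  color = mix(grass, cliff, blend);
} else if (in.height < -0.2) {
  color = cliff;
} else {
  // snow only where the slope is shallow
  color = in.slope < 0.5 ? snow : cliff;
}
return color;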
Key Points • Tessellation utilizes a tessellator chip on the GPU to create extra vertices. • You send patches to the GPU rather than vertices. The tessellator then breaks down these patches to smaller triangles. • A patch can be either a triangle or a quad. • The tessellation pipeline has an extra stage of setting edge and inside factors in a tessellation kernel. These factors decide the number of vertices that the tessellator should create. • The vertex shader handles the vertices created by the tessellator. • Vertex displacement uses a grayscale texture to move the vertex, generally in the y direction. • The Sobel Metal Performance Shader takes a texture and generates a new texture that defines the slope of a pixel.
Where to Go From Here? With very steep displacement, there can be lots of texture stretching between vertices. There are various algorithms to overcome this, and you can find one in Apple’s excellent sample code: Dynamic Terrain with Argument Buffers at https:// developer.apple.com/documentation/metal/fundamental_components/ gpu_resources/dynamic_terrain_with_argument_buffers. This is a complex project that showcases argument buffers, but the dynamic terrain portion is interesting. There’s another way to do blending. Instead of using mix(), the way you did in the challenge, you can use a texture map to define the different regions. This is known as texture splatting. You create a splat map with the red, blue and green channels describing up to three textures and where to use them.
A splat map With all of the techniques for reading and using textures that you’ve learned so far, texture splatting shouldn’t be too difficult for you to implement.
20
Chapter 20: Fragment Post-Processing
After the fragments are processed in the pipeline, a series of operations run on the GPU. These operations are sometimes referred to as Per-sample Processing (https://www.khronos.org/opengl/wiki/Per-Sample_Processing) and include: alpha testing, depth testing, stencil testing, scissor testing, blending and anti-aliasing. You’ve already encountered a few of these operations in earlier chapters, such as depth testing and stencil testing. Now it’s time to revisit those concepts while also learning about the others.
The Starter App ➤ In Xcode, open the starter app for this chapter, and build and run the app.
The starter app The standard forward renderer renders the scene using the PBR shader. This scene has a tree and ground plane, along with an extra window model that you’ll add later in this chapter. You can use the options at the top-left of the screen to toggle the post-processing effects. Those effects aren’t active yet, but they will be soon! Submesh and Model now accepts an optional texture to use as an opacity map. Later
in this chapter, you’ll update the PBR shader function to take into account a model’s opacity. If you need help adding textures to your renderer, review Chapter 11, “Maps & Materials”.
Using Booleans in a C Header File
In Renderer.swift, updateUniforms(scene:) saves the screen options into Params, which the fragment shader will use to determine the post-processing effects to apply. While the Metal Shading Language includes a Boolean type (bool), this type isn't available in C header files. The Shaders group included with this starter project contains stdbool.h. This file defines a bool, which Common.h imports and then uses to define the Boolean parameters in Params.
Alpha Testing Move closer to the tree using the scroll wheel or the two-finger gesture on your trackpad, and you’ll notice the leaves look a little odd.
Opaque edge around leaves ➤ Open Textures.xcassets, and select tree-color. Preview the texture by pressing the space bar.
Tree-color texture
The area of the texture surrounding the leaf is transparent, yet it renders as either white or black, depending on the device. To make the leaves look more natural, you’ll render the transparent part of the texture as transparent in the scene. However, before making this change, it’s important to understand the difference between transparent, translucent, and opaque objects. A transparent object allows light to entirely pass through it. A translucent object distorts light as it passes through it. An opaque object does not allow any light to pass through it. Most objects in nature are opaque. Objects like water, glass and plastic are translucent. Digital colors are formed using a combination of the three primary colors: red, green and blue — hence the color scheme RGB. However, there’s a fourth component you can add to the color definition: alpha. Alpha ranges from 0 (fully transparent) to 1 (fully opaque). A common practice in determining transparency is to check the alpha property and ignore values below a certain threshold. This technique is known as alpha testing. ➤ Open PBR.metal. In fragment_PBR, locate the conditional closure where you set material.baseColor. ➤ Replace the contents of if (!is_null_texture(baseColorTexture)) {} with: float4 color = baseColorTexture.sample( textureSampler, in.uv * params.tiling); if (params.alphaTesting && color.a < 0.1) { discard_fragment(); return 0; } material.baseColor = color.rgb;
With this change, you now read in the alpha value of the color as well as the RGB values. If the alpha is less than a 0.1 threshold, and you’re performing alpha testing, then you discard the fragment. The GPU ignores the returned 0 and stops processing the fragment.
➤ Build and run the app, and toggle Alpha Testing to see the difference.
Alpha testing That’s much better! Now, when you get closer to the tree, you’ll notice the white background around the leaves is gone.
Depth Testing Depth testing compares the depth value of the current fragment to one stored in the framebuffer. If a fragment is farther away than the current depth value, this fragment fails the depth test and is discarded since it’s occluded by another fragment. You learned about depth testing in Chapter 7, “The Fragment Function”.
Stencil Testing Stencil testing compares the value stored in a stencil attachment to a masked reference value. If a fragment makes it through the mask it’s kept, otherwise it’s discarded. You learned about stencil testing in Chapter 15, “Tile-Based Deferred Rendering”.
Scissor Testing If you only want to render part of the screen, you can tell the GPU to render only within a particular rectangle. This is much more efficient than rendering the entire screen. The scissor test checks whether a fragment is inside a defined 2D area known as the scissor rectangle. If the fragment falls outside of this rectangle, it’s discarded. ➤ Open ForwardRenderPass.swift, which is where you set up your render command encoder to draw the models. ➤ In draw(commandBuffer:scene:uniforms:params:), before for model in scene.models, add this: if params.scissorTesting { let marginWidth = Int(params.width) / 4 let marginHeight = Int(params.height) / 4 let width = Int(params.width) / 2 let height = Int(params.height) / 2 let rect = MTLScissorRect( x: marginWidth, y: marginHeight, width: width, height: height) renderEncoder.setScissorRect(rect) }
Here, you set the scissor rectangle to half the width and height of the current metal view. ➤ Build and run the app, and turn on Scissor Testing.
Scissor testing
Keep in mind that any objects rendered before you set the scissor rectangle are not affected. This means that you can choose to render within a scissor rectangle only for selected models.
Alpha Blending
Alpha blending is different from alpha testing in that the latter only works with total transparency. In that case, all you have to do is discard fragments. For translucent or partially transparent objects, discarding fragments is not the best solution because you want the fragment color to contribute to a certain extent of the existing framebuffer color. You don't just want to replace it. You had a taste of blending in Chapter 14, “Deferred Rendering”, when you blended the result of your point lights. The formula for alpha blending is as follows:

Cb = Cs × ⍺1 + Cd × ⍺2

Going over this formula:
• Cs: Source color. The current color you just added to the scene.
• Cd: Destination color. The color that already exists in the framebuffer.
• Cb: Final blended color.
• ⍺1 and ⍺2: The alpha (opacity) factors for the source and destination color, respectively.
The final blended color is the result of adding the products between the two colors and their opacity factors. The source color is the fragment color you put in front, and the destination color is the color already existing in the framebuffer. Often the two factors are the inverse of each other, transforming this equation into linear color interpolation:

Cb = Cs × ⍺ + Cd × (1 − ⍺)
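For example, compositing a half-transparent red fragment over an opaque blue framebuffer value works out to a purple:

Cs = (1, 0, 0), ⍺ = 0.5   // the translucent fragment you're drawing
Cd = (0, 0, 1)            // the color already in the framebuffer
Cb = 0.5 × (1, 0, 0) + (1 − 0.5) × (0, 0, 1) = (0.5, 0, 0.5)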
All right, time to install a glass window in front of the tree. ➤ Open GameScene.swift. In init(), change models = [ground, tree] to: window.position = [0, 3, -1] models = [window, ground, tree]
➤ Build and run the app, and you’ll see the window:
The window in the scene You can’t yet view the tree through the window, but you’ll fix that with blending. There are two ways to work with blending: the programmable way and the fixedfunction way. You used programmable blending with color attachments in Chapter 15, “Tile-Based Deferred Rendering”. In this chapter, you’ll use fixed-function blending.
Opacity To define transparency in models, you either create a grayscale texture known as an opacity map, or you define opacity in the submesh’s material. The window’s glass group has an opacity map where white means fully opaque, and black means fully transparent.
The window's opacity map
Blending To implement blending, you need a second pipeline state in your render pass. You’ll still use the same shader functions, but you’ll turn on blending in the GPU. ➤ Open Pipelines.swift, and copy createForwardPSO() to a new method. ➤ Rename the new method to createForwardTransparentPSO(). ➤ In createForwardTransparentPSO(), after setting pipelineDescriptor.colorAttachments[0].pixelFormat, add this: // 1 let attachment = pipelineDescriptor.colorAttachments[0] // 2 attachment?.isBlendingEnabled = true // 3 attachment?.rgbBlendOperation = .add // 4 attachment?.sourceRGBBlendFactor = .sourceAlpha // 5 attachment?.destinationRGBBlendFactor = .oneMinusSourceAlpha
With this code, you: 1. Grab the first color attachment from the render pipeline descriptor. The color attachment is a color render target that specifies the color configuration and color operations associated with a render pipeline. The render target holds the drawable texture where the rendering output goes. 2. Enable blending on the attachment. 3. Specify the blending type of operation used for color. Blend operations determine how a source fragment is combined with a destination value in a color attachment to determine the pixel value to be written. 4. Specify the blend factor used by the source color. A blend factor is how much the color will contribute to the final blended color. If not specified, this value is always 1 (.one) by default. 5. Specify the blend factor used by the destination color. If not specified, this value is always 0 (.zero) by default.
Note: There are quite a few blend factors available to use other than sourceAlpha and oneMinusSourceAlpha. For a complete list of options, consult Apple’s official page for Blend Factors (https://developer.apple.com/ documentation/metal/mtlblendfactor). ➤ Open ForwardRenderPass.swift, and add a new property to ForwardRenderPass: var transparentPSO: MTLRenderPipelineState
➤ In init(view:), add this: transparentPSO = PipelineStates.createForwardTransparentPSO()
You initialized the new pipeline state object with your new pipeline creation method. ➤ In draw(commandBuffer:scene:uniforms:params:), replace renderEncoder.setRenderPipelineState(pipelineState) with: renderEncoder.setRenderPipelineState(transparentPSO)
You temporarily replace the pipeline state with your new one. Blending is always enabled now. ➤ Open PBR.metal. In fragment_PBR, after the conditional where you set material.baseColor, add this: if (params.alphaBlending) { if (!is_null_texture(opacityTexture)) { material.opacity = opacityTexture.sample(textureSampler, in.uv).r; } }
If you have the alpha blending option turned on, read the value from a provided opacity texture. If no texture is provided, you’ll use the default from material, loaded with the model’s submesh. ➤ At the end of fragment_PBR, replace the return value with: return float4(diffuseColor + specularColor, material.opacity);
You return the opacity value in the alpha channel.
➤ Build and run the app.
Opacity not working Even though you’ve set blending in the pipeline state and changed the opacity in the fragment function, the opacity doesn’t appear to be working. The glass has changed color from the previous render. This indicates that the transparency is actually working. It’s showing the view’s clear color through the glass, rather than the tree and the ground. ➤ Open GameScene.swift. In init(), change models = [window, ground, tree] to: models = [ground, tree, window]
➤ Build and run the app.
Opacity is working The opacity is working, and if you zoom in, you can see the weathering on the old glass.
Transparent Mesh Rendering Order The blending order is important. Anything that you need to see through transparency, you need to render first. However, it may not always be convenient to work out exactly which models require blending. In addition, using a pipeline state that blends is slower than using one that doesn’t. ➤ Undo the previous change to models so that you render the window first again. First set up your models to indicate whether any of the submeshes aren’t opaque. ➤ Open Submesh.swift, and add a computed property to Submesh: var transparency: Bool { return textures.opacity != nil || material.opacity < 1.0 } transparency is true if the submesh textures or material indicate transparency.
➤ Open Model.swift, and add a new property: let hasTransparency: Bool
To initialize this property, you’ll process all of the model’s submeshes, and if any of them have transparency set to true, then the model is not fully opaque. ➤ At the end of init(name:), add this: hasTransparency = meshes.contains { mesh in mesh.submeshes.contains { $0.transparency } }
If any of the model’s submeshes have transparency, you’ll process the model during the transparency render phase. ➤ In render(encoder:uniforms:params:), locate for submesh in mesh.submeshes. ➤ At the top of the for loop, add this: if submesh.transparency != params.transparency { continue }
You only render the submesh if its transparency matches the current transparency in the render loop.
➤ Open Common.h, and add a new property to Params: bool transparency;
You’ll use this property to track when you’re currently rendering transparent submeshes. ➤ Open ForwardRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), change the pipeline state back
to what it was originally: renderEncoder.setRenderPipelineState(pipelineState)
➤ Locate for model in scene.models. Before the for loop, add this: var params = params params.transparency = false
In the render loop, you’ll only render submeshes with no transparency. ➤ Now, before renderEncoder.endEncoding(), add this: // transparent mesh renderEncoder.pushDebugGroup("Transparency") let models = scene.models.filter { $0.hasTransparency } params.transparency = true if params.alphaBlending { renderEncoder.setRenderPipelineState(transparentPSO) } for model in models { model.render( encoder: renderEncoder, uniforms: uniforms, params: params) } renderEncoder.popDebugGroup()
Here, you filter the scene models array to find only those models that have a transparent submesh. You then change the pipeline state to use alpha blending, and render the filtered models.
➤ Build and run the app.
Alpha blending You can now see through your window. Note: If you have several transparent meshes overlaying each other, you’ll need to sort them to ensure that you render them in strict order from back to front. ➤ In the app, turn off Alpha Blending. At the end of the render loop, the pipeline state doesn’t switch to the blending one, so the window becomes opaque again.
Alpha blending turned off
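Note: If you do need that back-to-front ordering, a simple approach is to sort the transparent models by their distance from the camera just before the transparency loop. This is a sketch only; cameraPosition and model.position are assumptions about what your scene exposes, so adapt the names to your project:

// farthest transparent models first, so nearer glass blends over what's behind it
let sortedModels = scene.models
  .filter { $0.hasTransparency }
  .sorted {
    distance($0.position, cameraPosition) >
      distance($1.position, cameraPosition)
  }
for model in sortedModels {
  model.render(encoder: renderEncoder, uniforms: uniforms, params: params)
}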
Antialiasing
Often, rendered models show slightly jagged edges that are visible when you zoom in. This is known as aliasing and is caused by the rasterizer when generating the fragments.
Rasterizing a triangle If you look at the edges of a triangle — or any straight line with a slope — you’ll notice the line doesn’t always go precisely through the center of a pixel. Some pixels are colored above the line and some below it. The solution to fixing aliasing is to use antialiasing. Antialiasing applies techniques to render smoother edges. By default, the pipeline uses one sample point (subpixel) for each pixel that is close to the line to determine if they meet. However, it’s possible to use four or more points for increased accuracy of intersection determination. This is known as Multisample Antialiasing (MSAA), and it’s more expensive to compute. Next, you’re going to configure the fixed-function MSAA on the pipeline and enable antialiasing on both the tree and the window. ➤ Open Pipelines.swift, and duplicate both createForwardPSO() and createForwardTransparentPSO(). ➤ Name them createForwardPSO_MSAA() and createForwardTransparentPSO_MSAA(), respectively. ➤ In both new methods, before the return, add this: pipelineDescriptor.sampleCount = 4
➤ Open ForwardRenderPass.swift, and add two new properties: var pipelineState_MSAA: MTLRenderPipelineState var transparentPSO_MSAA: MTLRenderPipelineState
➤ In init(), initialize the new pipeline states with your new pipeline state creation methods: pipelineState_MSAA = PipelineStates.createForwardPSO_MSAA() transparentPSO_MSAA = PipelineStates.createForwardTransparentPSO_MSAA()
➤ At the top of draw(commandBuffer:scene:uniforms:params:), add this: let pipelineState = params.antialiasing ? pipelineState_MSAA : pipelineState let transparentPSO = params.antialiasing ? transparentPSO_MSAA : transparentPSO
Depending upon whether the user has selected Antialiasing, you set the different pipeline state. The render target texture must match the same sample count as the pipeline state. ➤ Open Renderer.swift, and at the top of draw(scene:in:) before the guard, add this: view.sampleCount = options.antialiasing ? 4 : 1
The current render pass descriptor will use the sample count to create the render target texture with the correct antialiasing. ➤ Build and run the app. On modern retina devices, this effect can be quite difficult to see. But, if you zoom in to a straight line on a slope — such as the tree trunk — and toggle Antialiasing, you may notice the difference.
Antialiasing
Fog Let’s have a bit more fun and add some fog to the scene! Fog is quite useful in rendering. First, it serves as a far delimiter for rendered content. The renderer can ignore objects that get lost in the fog since they’re not visible anymore. Second, fog helps you avoid the popping-up effect that can happen when objects that are farther away from the camera “pop” into the scene as the camera moves closer. With fog, you can make their appearance into the scene more gradual. Note: Fog isn’t a post-processing effect, it’s added in the fragment shader. ➤ Open PBR.metal, and add a new function before fragment_PBR: float4 fog(float4 position, float4 color) { // 1 float distance = position.z / position.w; // 2 float density = 0.2; float fog = 1.0 - clamp(exp(-density * distance), 0.0, 1.0); // 3 float4 fogColor = float4(1.0); color = mix(color, fogColor, fog); return color; }
With this code, you: 1. Calculate the depth of the fragment position. 2. Define a distribution function that the fog will use next. It’s the inverse of the clamped (between 0 and 1) product between the fog density and the depth calculated in the previous step. 3. Mix the current color with the fog color (which you deliberately set to white) using the distribution function defined in the previous step.
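To get a feel for the density value, plug a few depths into the distribution function 1 - exp(-density * distance) with density = 0.2:

// distance  1  ->  1 - e^(-0.2) ≈ 0.18   (barely fogged)
// distance  5  ->  1 - e^(-1.0) ≈ 0.63
// distance 10  ->  1 - e^(-2.0) ≈ 0.86   (mostly fog color)

Raising the density pushes the curve toward 1 sooner, so the fog closes in nearer to the camera.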
➤ Change the return value of fragment_PBR to: float4 color = float4(diffuseColor + specularColor, material.opacity); if (params.fog) { color = fog(in.position, color); } return color;
Here, you include the fog value in the final color. ➤ Build and run the app.
Fog Perfect, the entire scene is foggy. The closer you get to the tree, the less dense the fog. The same happens to the ground. Like with real fog, the closer you get to an object, the easier it is to see it. Check it out: get closer to the tree, and you’ll see it a lot better. Because this effect is worked in the fragment shader, the sky is not affected by fog. The sky color is coming from the MTKView instead of being rendered. In the next chapter, you’ll create a rendered sky that you can affect with fog.
Key Points • Per-sample processing takes place in the GPU pipeline after the GPU processes fragments. • Using discard_fragment() in the fragment function halts further processing on the fragment. • To render only part of the texture, you can define a 2D scissor rectangle. The GPU discards any fragments outside of this rectangle. • You set up the pipeline state object with blending when you require transparency. You can then set the alpha value of the fragment in the fragment function. Without blending in the pipeline state object, all fragments are fully opaque, no matter their alpha value. • Multisample antialiasing improves render quality. You set up MSAA with the sampleCount in the pipeline state descriptor. • You can add fog with some clever distance shading in the fragment function.
Where to Go From Here? Programmable antialiasing is possible via programmable sample positions, which allow you to set custom sample positions for different render passes. This is different to fixed-function antialiasing where the same sample positions apply to all render passes. For further reading, you can review Apple’s Positioning Samples Programmatically (https://developer.apple.com/documentation/metal/ mtlrenderpassdescriptor/positioning_samples_programmatically) article.
21
Chapter 21: Imaged-Based Lighting
In this chapter, you’ll add the finishing touches to rendering your environment. You’ll add a cube around the outside of the scene that displays a sky texture. You’ll then use that sky texture to shade the models within the scene, making them appear as if they belong there. Look at the following comparison of two renders.
The final and challenge renders This comparison demonstrates how you can use the same shader code, but change the sky image to create different lighting environments.
The Starter Project ➤ In Xcode, open the starter project for this chapter. ➤ Build and run the app.
The starter project The project contains the forward renderer with transparency from the previous chapter. The scene uses an arcball camera, and contains a ground plane and car. The scene lighting consists of one sunlight and the PBR shader provides the shading. There are a few additional files that you’ll use throughout the chapter. Common.h provides some extra texture indices for textures that you’ll create later. Aside from the darkness of the lighting, there are some glaring problems with the render: • All metals, such as the metallic wheel hubs, look dull. Pure metals reflect their surroundings, and there are currently no surroundings to reflect. • Where the light doesn’t directly hit the car, the color is pure black. This happens because the app doesn’t provide any ambient light. Later in this chapter, you’ll use the skylight as global ambient light.
The Skybox Currently, the sky is a single color, which looks unrealistic. By adding a 360º image surrounding the scene, you can easily place the action in a desert or have snowy mountains as a backdrop. To do this, you’ll create a skybox cube that surrounds the entire scene.
This skybox cube is the same as an ordinary model, but instead of viewing it from the outside, the camera is at the center of the cube looking out. You’ll texture the cube with a cube texture, which gives you a cheap way of creating a complete environment. You may think the cube will be distorted at the corners, but as you’ll see, each fragment of the cube will render at an effectively infinite distance, and no distortion will occur. Cube maps are much easier to create than spherical ones and are hardware optimized. ➤ In the Geometry group, create a new Swift file for the skybox class named Skybox.swift. ➤ Replace the default code with: import MetalKit struct Skybox { let mesh: MTKMesh var skyTexture: MTLTexture? let pipelineState: MTLRenderPipelineState let depthStencilState: MTLDepthStencilState? }
Going through the skybox properties: • mesh: A cube that you’ll create using a Model I/O primitive. • skyTexture: A cube texture of the name given in the initializer. This is the texture that you’ll see in the background. • pipelineState: The skybox needs a simple vertex and fragment function, therefore it needs its own pipeline. • depthStencilState: Each pixel of the skybox will be positioned at the very edge of normalized clip space. The default depth stencil state in RenderPass.swift renders the fragment if the fragment is less than the current depth value. The skybox depth stencil should test less than or equal to the current depth value. You’ll see why shortly. Your project won’t compile until you’ve initialized all stored properties. ➤ Add the initializer to Skybox: init(textureName: String?) { let allocator = MTKMeshBufferAllocator(device: Renderer.device)
Metal by Tutorials
}
Chapter 21: Imaged-Based Lighting
let cube = MDLMesh( boxWithExtent: [1, 1, 1], segments: [1, 1, 1], inwardNormals: true, geometryType: .triangles, allocator: allocator) do { mesh = try MTKMesh( mesh: cube, device: Renderer.device) } catch { fatalError("failed to create skybox mesh") }
Here, you create a cube mesh. Notice that you set the normals to face inwards. That’s because the whole scene will appear to be inside the cube. ➤ In the Render Passes group, open Pipelines.swift, and add a new method: static func createSkyboxPSO( vertexDescriptor: MTLVertexDescriptor? ) -> MTLRenderPipelineState { let vertexFunction = Renderer.library?.makeFunction(name: "vertex_skybox") let fragmentFunction = Renderer.library?.makeFunction(name: "fragment_skybox") let pipelineDescriptor = MTLRenderPipelineDescriptor() pipelineDescriptor.vertexFunction = vertexFunction pipelineDescriptor.fragmentFunction = fragmentFunction pipelineDescriptor.colorAttachments[0].pixelFormat = Renderer.colorPixelFormat pipelineDescriptor.depthAttachmentPixelFormat = .depth32Float pipelineDescriptor.vertexDescriptor = vertexDescriptor return createPSO(descriptor: pipelineDescriptor) }
When you create the pipeline state, you’ll pass in the skybox cube’s Model I/O vertex descriptor. You’ll write the two new shader functions shortly. ➤ Open Skybox.swift, and add a new method to create the depth stencil state: static func buildDepthStencilState() -> MTLDepthStencilState? { let descriptor = MTLDepthStencilDescriptor() descriptor.depthCompareFunction = .lessEqual descriptor.isDepthWriteEnabled = true return Renderer.device.makeDepthStencilState( descriptor: descriptor) }
This creates the depth stencil state with the less than or equal comparison method mentioned earlier. ➤ Complete the initialization by adding the following code to the end of init(textureName:): pipelineState = PipelineStates.createSkyboxPSO( vertexDescriptor: MTKMetalVertexDescriptorFromModelIO( cube.vertexDescriptor)) depthStencilState = Self.buildDepthStencilState()
You initialize the skybox’s pipeline state with the vertex descriptor provided by Model I/O.
Rendering the Skybox ➤ Still in Skybox.swift, create a new method to perform the skybox rendering: func render( encoder: MTLRenderCommandEncoder, uniforms: Uniforms ) { encoder.pushDebugGroup("Skybox") encoder.setRenderPipelineState(pipelineState) // encoder.setDepthStencilState(depthStencilState) encoder.setVertexBuffer( mesh.vertexBuffers[0].buffer, offset: 0, index: 0) }
Here, you set up the render command encoder with the properties you initialized. Leave the depth stencil state line commented out for the moment. ➤ Add this code at the end of render(encoder:uniforms:): var uniforms = uniforms uniforms.viewMatrix.columns.3 = [0, 0, 0, 1] encoder.setVertexBytes( &uniforms, length: MemoryLayout.stride, index: UniformsBuffer.index)
When you render a scene, you multiply each model’s matrix with the view matrix and the projection matrix. As you move through the scene, it appears as if the camera is moving through the scene, but in fact, the whole scene is moving around the camera.
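As a quick aside on why that particular column is the one to overwrite: in a column-major 4×4 transform, columns 0 through 2 hold the rotation (and scale), and column 3 holds the translation. Writing [0, 0, 0, 1] into column 3 therefore keeps the view matrix's rotation intact while removing any camera movement, which is exactly what a skybox needs.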
You don’t want the skybox to move, so you zero out column 3 of viewMatrix to remove the camera’s translation. However, you do still want the skybox to rotate with the rest of the scene, and also render with projection, so you send the uniform matrices to the GPU. ➤ Add the following code after the code you just added: let submesh = mesh.submeshes[0] encoder.drawIndexedPrimitives( type: .triangle, indexCount: submesh.indexCount, indexType: submesh.indexType, indexBuffer: submesh.indexBuffer.buffer, indexBufferOffset: 0) encoder.popDebugGroup()
Here, you draw the cube’s submesh.
The Skybox Shader Functions In the Shaders group, create a new Metal file named Skybox.metal. ➤ Add the following code to the new file: #import "Common.h" struct VertexIn { float4 position [[attribute(Position)]]; }; struct VertexOut { float4 position [[position]]; };
The structures are simple so far — you need a position in and a position out. ➤ Add the shader functions: vertex VertexOut vertex_skybox( const VertexIn in [[stage_in]], constant Uniforms &uniforms [[buffer(UniformsBuffer)]]) { VertexOut out; float4x4 vp = uniforms.projectionMatrix * uniforms.viewMatrix; out.position = (vp * in.position).xyww; return out; }
fragment half4 fragment_skybox( VertexOut in [[stage_in]]) { return half4(1, 1, 0, 1); }
Here, you create two very simple shaders — the vertex function moves the vertices to the projected position, and the fragment function returns yellow. This is a temporary color, which is startling enough that you’ll be able to see where the skybox renders. Notice in the vertex function that you swizzled the xyzw position to xyww. To place the sky as far away as possible, it needs to be at the very edge of NDC. During the change from clip space to NDC, the coordinates are all divided by w during the perspective divide stage. This will now result in the z coordinate being 1, which will ensure that the skybox renders behind everything else within the scene. The following diagram shows the skybox in camera space rotated by 45º. After projection and the perspective divide, the vertices will be flat against the far NDC plane.
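To see why the swizzle pins the skybox to the far plane, follow one vertex through the arithmetic. This is just the math the shader above already performs, not extra code to add:

clip position (x, y, z, w)  →  .xyww  →  (x, y, w, w)
NDC z after the perspective divide = w / w = 1.0

Because every skybox fragment lands at exactly z = 1.0, the lessEqual depth comparison you created earlier is what lets the skybox draw at the very edge of clip space while still losing to anything nearer.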
Integrating the Skybox Into the Scene ➤ Open GameScene.swift, and add a new property to GameScene: let skybox: Skybox?
➤ Add the following code to the top of init(): skybox = Skybox(textureName: nil)
You haven’t written the code for the skybox texture yet, but soon you’ll set it up so that nil will generate a physically simulated sky, and providing a texture name will load that sky texture.
➤ In the Render Passes group, open ForwardRenderPass.swift, and in draw(commandBuffer:scene:uniforms:params:), locate // transparent mesh. ➤ Add the following code before that comment: scene.skybox?.render( encoder: renderEncoder, uniforms: uniforms)
You render the skybox only during the opaque pass, after the opaque meshes. It may seem odd that you’re rendering the skybox after rendering the scene models, when it’s going to be the object that’s behind everything else. Remember early-Z testing from Chapter 3, “The Rendering Pipeline”: when objects are rendered, most of the skybox fragments will be behind them and will fail the depth test. Therefore, it’s more efficient to render the skybox as late as possible. You have to render before the transparent pass, so that any transparency will include the skybox texture. You’ve now integrated the skybox into the rendering process. ➤ Build and run the app to see the new yellow sky.
A flickering sky As you rotate the scene, the yellow sky flickers and shows the blue of the metal view’s clear color. This happens because the current depth stencil state is from ForwardRenderPass, and it’s comparing new fragments to less than the current depth buffer. The skybox coordinates are right on the edge, so sometimes they’re equal to the edge of clip space. ➤ Open Skybox.swift, and in render(encoder:uniforms:), uncomment encoder.setDepthStencilState(depthStencilState), and build and run the app again.
This time, the depth comparison is correct, and the sky is the solid yellow returned from the skybox fragment shader.
A yellow sky
Procedural Skies Yellow skies might be appropriate on a different planet, but how about a procedural sky? A procedural sky is one built out of various parameters such as weather conditions and time of day. Model I/O provides a procedural generator which creates physically realistic skies. ➤ Before exploring this API further, open and run skybox.playground in the resources folder for this chapter. This scene contains only a ground plane and a skybox. Use your mouse or trackpad to reorient the scene, and experiment with the sliders under the view to see how you can change the sky depending on: • turbidity: Haze in the sky. 0.0 is a clear sky. 1.0 spreads the sun’s color. • sun elevation: How high the sun is in the sky. 0.5 is on the horizon. 1.0 is overhead. • upper atmosphere scattering: Atmospheric scattering influences the color of the sky from reddish through orange tones to the sky at midday. • ground albedo: How clear the sky is. 0 is clear, while 10 can produce intense colors. It’s best to keep turbidity and upper atmosphere scattering low if you have high albedo.
As you move the sliders, the result is printed in the debug console so you can record these values for later use. See if you can create a sunrise:
A sunrise This playground uses Model I/O to create an MDLSkyCubeTexture. From this, the playground creates an MTLTexture and applies this as a cube texture to the sky cube. You’ll now do this in your project.
Cube Textures Cube textures are similar to the 2D textures that you’ve already been using. 2D textures map to a quad and have two texture coordinates, whereas cube textures consist of six 2D textures: one for each face of the cube. You sample the textures with a 3D vector. The easiest way to load a cube texture into Metal is to use Model I/O’s MDLTexture initializer. When creating cube textures, you can arrange the images in various combinations:
Alternatively, you can create a cube texture in an asset catalog and load the six images there. ➤ Back in your project, in the Textures group, open Textures.xcassets. sky is a sky texture complete with mipmaps. The sky should always render on the base mipmap level 0, but you’ll see later why you would use the other mipmaps. Aside from there being six images to one texture, moving the images into the asset catalog and creating the mipmaps is the same process as described in Chapter 8, “Textures”.
The sky texture in the asset catalog
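For comparison with the asset catalog route, here's roughly what loading a cube texture directly through Model I/O looks like when you have six separate face images. This is only a sketch: the function name and file names are placeholders rather than files in this project, it assumes the same import MetalKit as Skybox.swift, and the project's own loading goes through TextureController later in the chapter.

func loadSkyCubeTexture() -> MTLTexture? {
  // Six faces, conventionally ordered +X, -X, +Y, -Y, +Z, -Z.
  // These file names are hypothetical; substitute your own images.
  let faceNames = ["px.png", "nx.png", "py.png", "ny.png", "pz.png", "nz.png"]
  guard let mdlCube = MDLTexture(cubeWithImagesNamed: faceNames) else {
    return nil
  }
  // Convert the Model I/O texture into an MTLTexture you can bind to a shader.
  let loader = MTKTextureLoader(device: Renderer.device)
  return try? loader.newTexture(texture: mdlCube, options: nil)
}

The same initializer also accepts a single image with all six faces stacked vertically, which is the layout used for cube-sky.png later in this chapter.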
Adding the Procedural Sky You’ll use these sky textures shortly, but for now, you’ll add a procedural sky to your scene. ➤ Open Skybox.swift, and add these properties to Skybox: struct SkySettings { var turbidity: Float = 0.28 var sunElevation: Float = 0.6 var upperAtmosphereScattering: Float = 0.4 var groundAlbedo: Float = 0.8 } var skySettings = SkySettings()
You can use the values from the appropriate sliders in the playground if you prefer.
➤ Now, add the following method:
func loadGeneratedSkyboxTexture(dimensions: SIMD2<Int32>) -> MTLTexture? {
  var texture: MTLTexture?
  let skyTexture = MDLSkyCubeTexture(
    name: "sky",
    channelEncoding: .float16,
    textureDimensions: dimensions,
    turbidity: skySettings.turbidity,
    sunElevation: skySettings.sunElevation,
    upperAtmosphereScattering: skySettings.upperAtmosphereScattering,
    groundAlbedo: skySettings.groundAlbedo)
  do {
    let textureLoader = MTKTextureLoader(device: Renderer.device)
    texture = try textureLoader.newTexture(
      texture: skyTexture,
      options: nil)
  } catch {
    print(error.localizedDescription)
  }
  return texture
}
Model I/O uses your settings to create the sky texture. That’s all there is to creating a procedurally generated sky texture! ➤ Call the new method at the end of init(textureName:): if let textureName = textureName { } else { skyTexture = loadGeneratedSkyboxTexture(dimensions: [256, 256]) }
You’ll add the if part of this conditional shortly and load the named texture. The nil option provides a default sky. To render the texture, you’ll change the skybox shader function and ensure that the texture gets to the GPU. ➤ Still in Skybox.swift, in render(encoder:uniforms:), add the following code before the draw call: encoder.setFragmentTexture( skyTexture, index: SkyboxTexture.index)
The starter project already has the necessary texture enumeration indices set up in Common.h for the skybox textures. ➤ Open Skybox.metal, and add a new property to VertexOut: float3 textureCoordinates;
Generally, when you load a model, you also load its texture coordinates. However, when sampling texels from a cube texture, instead of using a uv coordinate, you use a 3D vector. For example, a vector from the center of any unit cube passes through the far top left corner at [-1, 1, 1].
Skybox coordinates Conveniently, even though the skybox’s far top-left vertex position is [-0.5, 0.5, 0.5], it still lies on the same vector, so you can use the skybox vertex position for the texture coordinates. ➤ Add this code to vertex_skybox before return out;: out.textureCoordinates = in.position.xyz;
➤ Change fragment_skybox to:
fragment half4 fragment_skybox(
  VertexOut in [[stage_in]],
  texturecube<half> cubeTexture [[texture(SkyboxTexture)]]) {
  constexpr sampler default_sampler(filter::linear);
  half4 color = cubeTexture.sample(
    default_sampler,
    in.textureCoordinates);
  return color;
}
Accessing a cube texture is similar to accessing a 2D texture. You mark the cube texture as texturecube in the shader function parameters and sample it using the textureCoordinates vector that you set up in the vertex function. ➤ Build and run the app, and you now have a realistic sky, simulating physics:
A procedural sky
Custom Sky Textures
As mentioned earlier, you can use your own 360º sky textures. The textures included in the starter project were downloaded from Poly Haven (https://polyhaven.com/hdris) — a great place to find environment maps. The HDRI was converted into six tone-mapped sky cube textures before they were added to the asset catalog. Note: If you want to create your own skybox textures or load HDRIs (high dynamic range images), you can find out how to do it in references.markdown included with this chapter's files. Loading a cube texture is almost the same as loading a 2D texture.
➤ Open TextureController.swift, and examine loadCubeTexture(imageName:). Just as with loadTexture(imageName:), you can load either a cube texture from the asset catalog or one 2D image consisting of the six faces vertically. ➤ Open Skybox.swift, and at the end of init(textureName:), in the first half of the incomplete conditional, add this: do { skyTexture = try TextureController.loadCubeTexture( imageName: textureName) } catch { fatalError(error.localizedDescription) }
You load the texture using the given texture name. ➤ Open GameScene.swift, and in init(), change the skybox initialization to: skybox = Skybox(textureName: "sky")
➤ Build and run the app to see your new skybox texture.
The skybox Notice that as you move about the scene, although the skybox rotates with the rest of the scene, it does not reposition. You should be careful that the sky textures you use don’t have objects that appear to be close, as they will always appear to stay at the same distance from the camera. Sky textures should be for background only. This skybox texture is not a great fit as the background does not match the ground plane.
Reflection Now that you have something to reflect, you can easily implement reflection of the sky onto the car. When rendering the car, all you have to do is take the camera view direction, reflect it about the surface normal, and sample the skycube along the reflected vector for the fragment color for the car.
Reflection Included in the starter project is a new fragment shader already set up with textures. ➤ In the Shaders group, open IBL.metal, and examine fragment_IBL in that file. The shader reads all of the possible model textures and sets values for: • base color • normal • roughness • metallic • ambient occlusion • opacity All of these maps will interact with the sky texture and eventually provide a beautiful render. Currently, the fragment function returns just the base color.
➤ Open Pipelines.swift, and change the fragment function name in createForwardPSO() to: let fragmentFunction = Renderer.library?.makeFunction(name: "fragment_IBL")
➤ Build and run the app.
Color textures only
The fragment shader renders the car and ground, returning the texture base color. The glass windshield has transparency, so it uses a different pipeline. It will render transparently with specular highlights, as it did before. ➤ Open Skybox.swift, and add this to Skybox: func update(encoder: MTLRenderCommandEncoder) { encoder.setFragmentTexture( skyTexture, index: SkyboxTexture.index) }
You send the skybox texture to the fragment shader. You’ll add other skybox textures to this method soon. ➤ Open ForwardRenderPass.swift, and in draw(commandBuffer:scene:uniforms:params:), add the following code before var params = params: scene.skybox?.update(encoder: renderEncoder)
The sky texture is now available to the GPU.
➤ Open IBL.metal, and add the skybox texture to the parameter list for fragment_IBL: texturecube<float> skybox [[texture(SkyboxTexture)]]
You read the skybox texture using the type texturecube. Instead of providing 2D uv coordinates, you’ll provide 3D coordinates. You now calculate the camera’s reflection vector about the surface normal to get a vector for sampling the skybox texture. To get the camera’s view vector, you subtract the fragment world position from the camera position. ➤ At the end of fragment_IBL, before return color;, add this: float3 viewDirection = in.worldPosition.xyz - params.cameraPosition; viewDirection = normalize(viewDirection); float3 textureCoordinates = reflect(viewDirection, normal);
Here, you calculate the view vector and reflect it about the surface normal to get the vector for the cube texture coordinates. ➤ Now, add this: constexpr sampler defaultSampler(filter::linear); color = skybox.sample( defaultSampler, textureCoordinates); float4 copper = float4(0.86, 0.7, 0.48, 1); color = color * copper;
Here, you sample the skybox texture for a color and multiply it by a copper color. ➤ Build and run the app.
Reflections
The rendered scene now appears to be made of beautifully shiny copper. As you rotate the scene, using your mouse or trackpad, you can see the sky reflected in the scene models. Note: This is not a true reflection since you’re only reflecting the sky texture. If you place any objects in the scene, they won’t be reflected. However, this reflection is a fast and easy effect, and is often sufficient. ➤ In fragment_IBL, remove the code you just added, i.e., the lines from constexpr sampler defaultSampler... to color = color * copper;. You’ll replace this with new lighting code. You’ll get a compiler warning on textureCoordinates until you use it later.
Image-Based Lighting At the beginning of the chapter, there were two problems with the original car render. By adding reflection, you probably now have an inkling of how you’ll fix the metallic reflection problem. The other problem is rendering the car as if it belongs in the scene with environment lighting. IBL or Image-Based Lighting is one way of dealing with this problem. Using the sky image you can extract lighting information. For example, the parts of the car that face the sun in the sky texture should shine more than the parts that face away. The parts that face away shouldn’t be entirely dark but should have ambient light filled in from the sky texture. Epic Games developed a technique for Fortnite, which they adapted from Disney’s research, and this has become the standard technique for IBL in games today. If you want to be as physically correct as possible, there’s a link to their article on how to achieve this included with the references.markdown for this chapter. You’ll be doing an approximation of their technique, making use of Model I/O for the diffuse.
Diffuse Reflection Light comes from all around us. Sunlight bounces around and colors reflect. When rendering an object, you should take into account the color of the light coming from every direction.
Diffuse reflection This is somewhat of an impossible task, but you can use convolution to compute a cube map called an irradiance map from which you can extract lighting information. You won’t need to know the mathematics behind this: Model I/O comes to the rescue again! The diffuse reflection for the car will come from a second texture derived from the sky texture. ➤ Open Skybox.swift, and add a new property to Skybox to hold this diffuse texture: var diffuseTexture: MTLTexture?
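You won't need it for the steps below, but if you're curious what that convolution computes: for each direction n, the irradiance map stores (up to a normalization factor) the sky's radiance integrated over the hemisphere around n, weighted by the cosine of the angle to n:

E(n) = ∫Ω L(ω) (n · ω) dω

That's why, a little later, the shader can sample this map with nothing more than the surface normal and get a ready-made diffuse lighting term.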
➤ To create the diffuse irradiance texture, add this temporary method: mutating func loadIrradianceMap() { // 1 guard let skyCube = MDLTexture(cubeWithImagesNamed: ["cube-sky.png"]) else { return } // 2 let irradiance = MDLTexture.irradianceTextureCube( with: skyCube, name: nil, dimensions: [64, 64], roughness: 0.6) // 3 let loader = MTKTextureLoader(device: Renderer.device) do { diffuseTexture = try loader.newTexture( texture: irradiance,
    options: nil)
  } catch {
    fatalError(error.localizedDescription)
  }
}
Going through this code: 1. Model I/O currently doesn’t load cube textures from the asset catalog, so, in the Textures group, your project has an image named cube-sky.png with the six faces included in it. Each of the faces is 128 x 128 pixels. 2. Use Model I/O to create the irradiance texture from the source image. Neither source nor destination textures have to be large, as the diffuse color is spread out. 3. Load the resultant MDLTexture to diffuseTexture. ➤ In Skybox.swift, add the following to the end of init(textureName:): loadIrradianceMap()
➤ In update(encoder:), add the following: encoder.setFragmentTexture( diffuseTexture, index: SkyboxDiffuseTexture.index)
This code will send the diffuse texture to the GPU. ➤ Open IBL.metal, and add the diffuse texture to the parameter list for fragment_IBL: texturecube<float> skyboxDiffuse [[texture(SkyboxDiffuseTexture)]]
➤ At the end of fragment_IBL, add this before return color;: float4 diffuse = skyboxDiffuse.sample(textureSampler, normal); color = diffuse * float4(material.baseColor, 1);
The diffuse value doesn’t depend on the angle of view, so you sample the diffuse texture using the surface normal. You then multiply the result by the base color.
➤ Build and run the app. Because of the irradiance convolution, the app may take a minute or so to start. As you rotate about the car, you’ll notice it’s very slightly brighter where it faces the skybox sun.
Diffuse from irradiance ➤ Click the Capture GPU frame icon to enter the GPU Debugger, and look at the generated irradiance map.
You can choose which of the six faces to show below the texture.
Instead of generating the irradiance texture each time, you can save the irradiance map to a file and load it from there. Included in the resources folder for this chapter is a project named IrradianceGenerator. You can use this app to generate your irradiance maps. In your project, in the Textures group, there’s a touched-up irradiance map named irradiance.png that matches and brightens the sky texture. It’s time to switch to using this irradiance map for the diffuse texture instead of generating it. ➤ Open Skybox.swift, and in init(textureName:), locate where you load skyTexture in the do...catch, and add the following code immediately after loading skyTexture: diffuseTexture = try TextureController.loadCubeTexture( imageName: "irradiance.png")
➤ Remove the method loadIrradianceMap(), and also remove the call to loadIrradianceMap() at the end of init(textureName:).
➤ Build and run the app to see results using the brighter prebuilt irradiance map.
Brighter irradiance
Specular Reflection The irradiance map provides the diffuse and ambient reflection, but the specular reflection is a bit more difficult. You may remember from Chapter 10, “Lighting Fundamentals”, that, whereas the diffuse reflection comes from all light directions, specular reflection depends upon the angle of view and the roughness of the material.
Specular reflection In Chapter 11, “Maps & Materials”, you had a foretaste of physically based rendering using the Cook-Torrance microfacet specular shading model. This model is defined as:
f(l, v) = D(h) F(v, h) G(l, v, h) / (4 (n · l)(n · v))
Where you provide the light direction (l), view direction (v) and the half vector (h) between l and v. As described, the functions are:
• D: Geometric micro-facet slope distribution
• F: Fresnel
• G: Geometric attenuation
Just as with the diffuse light, capturing the incoming specular light accurately means taking many samples, which is impractical in real-time rendering. Epic Games's approach in their paper, Real Shading in Unreal Engine 4 (http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_notes_v2.pdf), is to split up the shading model calculation. They prefilter the sky cube texture with the geometry distribution for various roughness values. For each roughness level, the texture gets smaller and blurrier, and you can store these pre-filtered environment maps as different mipmap levels in the sky cube texture.
Note: In the resources for this chapter, there's a project named Specular, which uses the code from Epic Games's paper. This project takes in six images — one for each cube face — and will generate pre-filtered environment maps for as many levels as you specify in the code. The results are placed in a subdirectory of Documents named specular, which you should create before running the project. You can then add the created .png files to the mipmap levels of the sky cube texture in your asset catalog.
➤ In Textures.xcassets, look at the sky texture. sky already contains the pre-filtered environment maps for each mip level.
Pre-filtered environment maps
BRDF Look-Up Table To calculate the final color, you use a Bidirectional Reflectance Distribution Function (BRDF) that takes in the actual roughness of the model and the current viewing angle and returns the scale and bias for the Fresnel and geometric attenuation contributions. You can encapsulate this BRDF in a look-up table (LUT) as a texture that behaves as a two-dimensional array. One axis is the roughness value of the object, and the other is the angle between the normal and the view direction. You input these two values as the UV coordinates and receive back a color. The red value contains the scale, and the green value contains the bias.
A BRDF LUT The more photorealistic you want your scene to be, the higher the level of mathematics you’ll need to know. In the resources folder for this chapter, in references.markdown, you’ll find links with suggested reading that explain the Cook-Torrance microfacet specular shading model. In the Utility/BRDF group, your project contains functions provided by Epic Games to create the BRDF look-up texture. You’ll now implement the compute shader that builds the BRDF look-up texture. ➤ Open Skybox.swift, and add a property for the new texture: var brdfLut: MTLTexture?
➤ At the end of init(textureName:), call the method supplied in the starter project to build the texture: brdfLut = Renderer.buildBRDF()
Renderer.buildBRDF() uses a complex compute shader in BRDF.metal to create a new texture. ➤ Still in Skybox.swift, in update(encoder:), add the following code to send the texture to the GPU: encoder.setFragmentTexture( brdfLut, index: BRDFLutTexture.index)
➤ Build and run the app, and click the Capture GPU frame icon to verify the lookup texture created by the BRDF compute shader is available to the GPU.
BRDF LUT is on the GPU
Notice the texture format is RG16Float. As a float format, this pixel format has a greater accuracy than RGBA8Unorm. All the necessary information is now on the GPU, so you need to receive the new BRDF look-up texture into the fragment shader and do the shader math. ➤ Open IBL.metal, and add the new parameter to fragment_IBL: texture2d<float> brdfLut [[texture(BRDFLutTexture)]]
➤ At the end of fragment_IBL, replace return color; with this: // 1 constexpr sampler s(filter::linear, mip_filter::linear); float3 prefilteredColor = skybox.sample(s, textureCoordinates, level(material.roughness * 10)).rgb; // 2 float nDotV = saturate(dot(normal, -viewDirection)); float2 envBRDF = brdfLut.sample(s, float2(material.roughness, nDotV)).rg; return float4(envBRDF, 0, 1);
Going through the code:
1. Read the skybox texture along the reflected vector as you did earlier. Using the extra parameter level(n), you can specify the mip level to read. You sample the appropriate mipmap for the roughness of the fragment.
2. Calculate the cosine of the angle between the view direction and the surface normal (nDotV), and use this as one of the UV coordinates to read the BRDF look-up texture. The other coordinate is the roughness of the surface. You receive back the red and green values which you'll use to calculate the second part of the Cook-Torrance equation.
➤ Build and run the app to see the result of the BRDF look-up.
The BRDF look up result At glancing angles on the car, the result is green.
Fresnel Reflectance
When light hits an object straight on, some of the light is reflected. The amount of reflection at this head-on angle is known as Fresnel zero, or F0, and you can calculate it from the material's index of refraction, or IOR. As the viewing angle approaches a grazing angle of 90º from the surface normal, the surface becomes nearly 100% reflective. For example, when you look across the water, it's reflective; but when you look straight down into the water, it barely reflects at all.
Most dielectric (non-metal) materials have an F0 of about 4%, so most rendering engines use this amount as standard. For metals, F0 is the base color. ➤ Replace return float4(envBRDF, 0, 1); with: float3 f0 = mix(0.04, material.baseColor.rgb, material.metallic); float3 specularIBL = f0 * envBRDF.r + envBRDF.g;
Here, you choose F0 as 0.04 for non-metals and the base color for metals. metallic should be a binary value of 0 or 1, but it’s best practice to avoid conditional branching in shaders, so you use mix(). You then calculate the second part of the rendering equation using the values from the look-up table. ➤ Add the following code after the code you just added: float3 specular = prefilteredColor * specularIBL; color += float4(specular, 1); return color; color now includes the diffuse from the irradiance skybox texture, the material base
color and the specular value.
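Taken together, the last few code additions are the split-sum approximation from the Epic Games notes, which in the shader's own variable names reads:

specular ≈ prefilteredColor * (f0 * envBRDF.r + envBRDF.g)

prefilteredColor supplies the environment light already filtered for the fragment's roughness, and the look-up table's red and green channels are the scale and bias applied to f0.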
➤ Build and run the app.
Diffuse and specular Your car render is almost complete. Non-metals take the roughness value — the seats are matte, and the car paint is shiny. Metals reflect but take on the base color — the base color of the wheel hubs and the steel bar behind the seats is gray.
Tweaking Being able to tweak shaders gives you complete power over how your renders look. Because you’re using low dynamic range lighting, the non-metal diffuse color looks a bit dark. You can tweak the color very easily. ➤ In fragment_IBL, after float4 diffuse = skyboxDiffuse.sample(textureSampler, normal);, add this: diffuse = mix(pow(diffuse, 0.2), diffuse, material.metallic); diffuse *= calculateShadow(in.shadowPosition, shadowTexture);
This code raises the diffuse value to a fractional power, which brightens it, but only for non-metals. You also reinstate the shadow. ➤ Build and run, and the car body is a lot brighter.
Tweaking the shader
The finishing touch will be to add a fake shadow effect using ambient occlusion. At the rear of the car, the exhausts look as if they are self-lit:
They should be shadowed because they are recessed. This is where ambient occlusion maps come in handy.
Ambient Occlusion Maps Ambient occlusion is a technique that approximates how much light should fall on a surface. If you look around you — even in a bright room — where surfaces are very close to each other, they’re darker than exposed surfaces. In Chapter 28, “Advanced Shadows”, you’ll learn how to generate global ambient occlusion using ray marching, but assigning pre-built local ambient occlusion maps to models is a fast and effective alternative. Apps such as Adobe Substance Painter can examine the model for proximate surfaces and produce an ambient occlusion map. This is the AO map for the car, which is included in your project.
The white areas on the left, with a color value of 1.0, are UV mapped to the car paint. These are fully exposed areas. When you multiply the final render color by 1.0, it’ll be unaffected. However, you can identify the wheel at the bottom right of the AO map, where the spokes are recessed. Those areas have a color value of perhaps 0.8, which darkens the final render color. The ambient occlusion map is all set up in the starter project and ready for you to use. ➤ Open IBL.metal, and in fragment_IBL, just before the final return, add this: color *= material.ambientOcclusion;
➤ Build and run the app, and compare the exhaust pipes to the previous render.
All of the recessed areas are darker, which gives more natural lighting to the model.
Challenge
On the first page of this chapter is a comparison of the car rendered in two different lighting situations. Your challenge is to create the red lighting scene. Provided in the resources folder for this chapter are six cube face PNG images converted from an HDRI downloaded from Poly Haven (https://polyhaven.com).
1. Create an irradiance map using the included IrradianceGenerator project, and import the generated map into the project.
2. Create specular mipmap levels using the included Specular project.
3. Create a new cube texture in the asset catalog.
4. Assign this new cube texture the appropriate generated mipmap images.
5. Change the sun light's position to [-1, 0.5, 2] to match the skybox.
Aside from the light position, there's no code to change — it's all imagery! You'll find the completed project in the challenge directory for this chapter.
Key Points
• Using a cuboid skybox, you can surround your scene with a texture.
• Model I/O can produce procedural skies from parameters for turbidity, sun elevation, upper atmosphere scattering and ground albedo.
• Cube textures have six faces. Each of the faces can have mipmaps.
• Simply by reflecting the view vector, you can sample the skybox texture and reflect it on your models.
• Image-based lighting uses the sky texture for lighting. You derive the diffuse color from a convolved irradiance map, and the specular from a Bidirectional Reflectance Distribution Function (BRDF) look-up table.
Where to Go From Here?
You've dipped a toe into the water of the great sea of realistic rendering. If you want to explore more about this fascinating topic, references.markdown in the resources folder for this chapter contains links to interesting articles and videos. This chapter did not touch on spherical harmonics, an alternative to using an irradiance texture map for diffuse reflection. Mathematically, you can approximate that irradiance map with just 27 floats: nine spherical harmonic coefficients (the first three bands) for each of the three color channels. Hopefully, the links in references.markdown will get you interested in this amazing technique. Before you try to achieve the ultimate realistic render, one question you should ask yourself is whether your game will actually benefit from realism. One way to stand out from the crowd is to create your own rendering style. Games such as Fortnite aren't entirely realistic and have a style all of their own. Experiment with shaders to see what you can create.
Chapter 22: Reflection & Refraction
When you create your game environments, you may need lakes of shimmering water or crystal balls. To look realistic, shiny glass objects require both reflection and refraction. Reflection is one of the most common interactions between light and objects. Imagine looking into a mirror. Not only would you see your image being reflected, but you’d also see the reflection of any nearby objects. Refraction is another common interaction between light and objects that you often see in nature. While it’s true that most objects in nature are opaque — thus absorbing most of the light they get — the few objects that are translucent, or transparent, allow for the light to propagate through them.
Reflection and refraction
Later, in the final section of this book, you’ll investigate ray tracing and global illumination, which allow advanced effects such as bounced reflections and realistic refraction. We’re approaching a time where ray tracing algorithms may be viable in games, but for now, real-time rendering with rasterized reflection and refraction is the way to go. An exemplary algorithm for creating realistic water was developed by Michael Horsch in 2005 (https://bit.ly/3H2P1ix). This realistic water algorithm is purely based on lighting and its optical properties, as opposed to having a water simulation based on physics.
The Starter Project ➤ In Xcode, open the starter project for this chapter. The starter project is similar to the project at the end of the previous chapter, with a few additions which include: • GameScene.swift contains a new scene with new models and renders a new skybox texture. You can move around the scene using WASD keys, and look about using the mouse or trackpad. Scrolling the mouse wheel, or pinching on iOS, moves you up and down, so you can get better views of your lake. • WaterRenderPass.swift, in the Render Passes group, contains a new render pass. It’s similar to ForwardRenderPass, but refactors the command encoder setup into a new render method. WaterRenderPass is all set up and ready to render in Renderer. • Water.swift, in the Geometry group, contains a new Water class, similar to Model. The class loads a primitive mesh plane and is set up to render the plane with its own pipeline state. • Pipelines.swift has new pipeline state creation methods to render water and a terrain.
➤ Build and run the app.
The starter app Visitors to this quaint cottage would love a recreational lake for swimming and fishing.
Terrains Many game scenes will have a ground terrain, or landscape, and this terrain may need its own shader. The starter project includes Terrain.swift, which contains Terrain, a subclass of Model. Changing shaders entails loading a new pipeline state, so Terrain creates its own pipeline state object along with a texture for use later. Terrain.metal holds the fragment function to render the terrain. After you’ve added some water, you’ll change the terrain texture to blend with an underwater texture. Instead of including the ground when rendering the scene models, ForwardRenderPass renders the terrain separately, as you did for the skybox.
Rendering Rippling Water Here’s the plan on how you’ll proceed through the chapter: 1. Render a large horizontal quad that will be the surface of the water. 2. Render the scene to a reflection texture. 3. Use a clipping plane to limit what geometry you render.
4. Distort the reflection using a normal map to create ripples on the surface. 5. Render the scene to a refraction texture. 6. Apply the Fresnel effect so that the dominance of each texture will change depending on the viewing angle. 7. Add smoothness to the water depth visibility using a depth texture. Ready? It’s going to be a wild ride but stick around until the end, because you won’t want to miss this.
1. Creating the Water Surface ➤ In the Geometry group, open Water.swift, and examine the code. Similar to Model, Water initializes a mesh and is Transformable, so you can position, rotate and scale the mesh. The mesh is a plane primitive. Water also has a render method where you’ll add textures and render the mesh plane. ➤ Open Pipelines.swift, and locate createWaterPSO(vertexDescriptor:). The water pipeline will need new shader functions. You’ll name the vertex function vertex_water and the fragment function fragment_water.
Creating the Water Shaders ➤ In the Shaders group, create a new Metal file named Water.metal, and add this: #import "Common.h" struct VertexIn { float4 position [[attribute(Position)]]; float2 uv [[attribute(UV)]]; }; struct VertexOut { float4 position [[position]]; float4 worldPosition; float2 uv; }; vertex VertexOut vertex_water( const VertexIn in [[stage_in]], constant Uniforms &uniforms [[buffer(UniformsBuffer)]]) {
  float4x4 mvp = uniforms.projectionMatrix * uniforms.viewMatrix
    * uniforms.modelMatrix;
  VertexOut out {
    .position = mvp * in.position,
    .uv = in.uv,
    .worldPosition = uniforms.modelMatrix * in.position
  };
  return out;
}
fragment float4 fragment_water(
  VertexOut in [[stage_in]],
  constant Params &params [[buffer(ParamsBuffer)]]) {
  return float4(0.0, 0.3, 0.5, 1.0);
}
This code provides a minimal configuration for rendering the water surface quad. The vertex function moves the mesh into position, and the fragment function shades the mesh a bluish color.
Adding the Water to Your Scene ➤ Open GameScene.swift, and add a new property: var water: Water?
➤ Add the following code to init(): water = Water() water?.position = [0, -1, 0]
With this code, you initialize and position the water plane. ➤ Open ForwardRenderPass.swift, and in draw(commandBuffer:scene:uniforms:params:), locate where you render the
skybox. ➤ Add the following code immediately before rendering the skybox: scene.water?.render( encoder: renderEncoder, uniforms: uniforms, params: params)
➤ Build and run the app.
The initial water plane The cottage is now a waterfront vacation home.
2. Rendering the Reflection The water plane should reflect its surroundings. In Chapter 21, “Image-Based Lighting”, you reflected the skybox onto objects, but this time you’re also going to reflect the house and terrain on the water. You’re going to render the scene to a texture from a point underneath the water pointing upwards. You’ll then take this texture and render it flipped on the water surface.
You’ll do all this in WaterRenderPass. ➤ Open WaterRenderPass.swift.
WaterRenderPass initializes the pipeline and depth stencil states. Each frame, Renderer calls waterRenderPass.draw(commandBuffer:scene:uniforms:params:), which
renders the entire scene (minus the water). Currently the render pass doesn’t render anything, because descriptor is nil and the app exits the method. ➤ In WaterRenderPass, add this to init(): descriptor = MTLRenderPassDescriptor()
You initialize a new render pass descriptor. ➤ Add some new texture properties to WaterRenderPass: var reflectionTexture: MTLTexture? var refractionTexture: MTLTexture? var depthTexture: MTLTexture?
You’ll keep both a reflection and a refraction texture, as you’ll render these textures from different camera positions. Although you’re setting up the refraction texture, you won’t be using it until later in the chapter. ➤ In resize(view:size:), add this: let size = CGSize( width: size.width / 2, height: size.height / 2) reflectionTexture = Self.makeTexture( size: size, pixelFormat: view.colorPixelFormat, label: "Reflection Texture") refractionTexture = Self.makeTexture( size: size, pixelFormat: view.colorPixelFormat, label: "Refraction Texture") depthTexture = Self.makeTexture( size: size, pixelFormat: .depth32Float, label: "Reflection Depth Texture")
Any time you can save on memory, you should. For reflection and refraction you don’t really need sharp images, so you create the textures at half the usual size.
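To put a rough number on the saving, take a 2048×1024 drawable with a 4-byte pixel format purely as an example (your drawable size will differ): a full-size color target would be 2048 × 1024 × 4 ≈ 8 MB, while the half-size reflection target is 1024 × 512 × 4 ≈ 2 MB, a quarter of the memory. The refraction and depth textures save at the same ratio.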
➤ Add the following code to the top of draw(commandBuffer:scene:uniforms:params:): let attachment = descriptor?.colorAttachments[0] attachment?.texture = reflectionTexture attachment?.storeAction = .store let depthAttachment = descriptor?.depthAttachment depthAttachment?.texture = depthTexture depthAttachment?.storeAction = .store
You’ll render color to the reflectionTexture, depth to depthTexture and ensure that the GPU stores the textures for later use. Because you’ll render the skybox — which will cover the whole render target — you don’t care what the load actions are. The water plane will use these render textures, so you’ll store them to Water. ➤ In the Geometry group, open Water.swift, and add the new properties to Water: weak var reflectionTexture: MTLTexture? weak var refractionTexture: MTLTexture? weak var refractionDepthTexture: MTLTexture?
➤ In render(encoder:uniforms:params:), add the following code before the draw call: encoder.setFragmentTexture( reflectionTexture, index: 0) encoder.setFragmentTexture( refractionTexture, index: 1)
Soon, you’ll change the fragment_water shader to use these textures. ➤ Open WaterRenderPass.swift, and add the following code to the top of draw(commandBuffer:scene:uniforms:params:): guard let water = scene.water else { return } water.reflectionTexture = reflectionTexture water.refractionTexture = refractionTexture water.refractionDepthTexture = depthTexture
Here, you pass on the render target textures to Water for texturing the water plane. ➤ Build and run the app to see progress so far.
You won’t see any obvious changes, but capture the GPU workload and check out the water render pass:
WaterRenderPass renders an exact duplicate of the scene, not including the water
plane, to the reflection render target texture. ➤ In the Shaders group, open Water.metal, and add new parameters to fragment_water: texture2d<float> reflectionTexture [[texture(0)]], texture2d<float> refractionTexture [[texture(1)]]
You’re concentrating on reflection for now, but you’ll use the refraction texture later. ➤ In fragment_water, replace the return line with this: // 1 constexpr sampler s(filter::linear, address::repeat); // 2 float width = float(reflectionTexture.get_width() * 2.0); float height = float(reflectionTexture.get_height() * 2.0); float x = in.position.x / width; float y = in.position.y / height; float2 reflectionCoords = float2(x, 1 - y); // 3 float4 color = reflectionTexture.sample(s, reflectionCoords); color = mix(color, float4(0.0, 0.3, 0.5, 1.0), 0.3); return color;
Going through the code: 1. Create a new sampler with linear filtering and repeat addressing mode, so that the texture tiles at the edge if necessary. 2. Determine the reflection coordinates which will use an inverted y value because the reflected image is a mirror of the scene above the water surface. Notice you multiply by 2.0. You do this because the texture is only half-size. 3. Sample the color from the reflection texture, and mix it a little with the previous bluish color the water plane had before. ➤ Build and run the app, and you’ll see the house, terrain and sky reflected on the water surface.
Initial reflection It looks nice now, but rotate the camera with your mouse or trackpad, and you’ll see the reflection is incorrect. You need to render the reflection target from a different camera position.
Incorrect reflection
➤ Open WaterRenderPass.swift, and in draw(commandBuffer:scene:uniforms:params:), add the following code before calling render(renderEncoder:scene:uniforms:params:). var reflectionCamera = scene.camera reflectionCamera.rotation.x *= -1 let position = (scene.camera.position.y - water.position.y) * 2 reflectionCamera.position.y -= position var uniforms = uniforms uniforms.viewMatrix = reflectionCamera.viewMatrix
Here, you use a separate camera just for reflection: you flip its pitch and mirror its height about the water plane (the new height is cameraY - 2 * (cameraY - waterY), which equals 2 * waterY - cameraY), so it sits as far below the surface as the main camera is above it and captures what's above the surface. ➤ Build and run the app to see the updated result:
Reflected camera position That didn’t work out too well. As the main camera moves up the y-axis, the reflection camera moves down the yaxis to below the terrain surface which blocks the view to the sky. You could temporarily solve this by culling the terrain’s back faces when you render, but this may introduce other rendering artifacts. A better way of dealing with this issue is to clip the geometry you don’t want to render.
3. Creating Clipping Planes
A clipping plane, as its name suggests, clips the scene using a plane. It's hardware accelerated: geometry outside the clip range is discarded after the vertex stage and never continues through the rest of the pipeline. You may get a significant performance boost because the clipped geometry no longer needs to be processed by the fragment shaders. For the reflection texture, you only need to render the scene as if from under the water, flip it, and add it to the final render. Placing the clipping plane at the level of the water ensures that only the scene geometry above the water is rendered to the reflection texture.
The clipping plane ➤ Still in WaterRenderPass.swift, in draw(commandBuffer:scene:uniforms:params:), after the previous code, add this: var clipPlane = float4(0, 1, 0, -water.position.y) uniforms.clipPlane = clipPlane
With this code, you create clipPlane as a var because you'll adjust it shortly for refraction. The clipping plane's xyz is a direction vector that denotes the clipping direction, and the last component is the negated water level, which places the plane right at the water surface. ➤ In the Shaders group, open Common.h, and add a new member to Uniforms: vector_float4 clipPlane;
➤ Open Vertex.h, and add a new member to VertexOut: float clip_distance [[clip_distance]] [1];
Notice the Metal Shading Language attribute, clip_distance, which is one of the built-in attributes exclusively used by vertex shaders. The clip_distance attribute is an array of distances, and the [1] argument represents its size — a 1 in this case because you only need one member in the array. You may have noticed that FragmentIn is a duplicate of VertexOut. [[clip_distance]] is a vertex-only attribute, and fragment functions now won't compile if they use VertexOut. Note: You can read more about matching vertex and fragment attributes in the Metal Shading Language specification (https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf), in Section 5.7.1 "Vertex-Fragment Signature Matching." ➤ Open Shaders.metal, and add this to vertex_main before return: out.clip_distance[0] = dot(uniforms.modelMatrix * in.position, uniforms.clipPlane);
Any negative result in out.clip_distance[0] will result in the vertex being clipped. For the plane you just set, that dot product works out to y - waterY for a world-space position (x, y, z, 1), so it's positive above the water line and negative below it, and everything underwater gets clipped. You're now clipping any geometry processed by vertex_main. You could also clip the skybox in the same way, but when you later add ripples, they may go below the clipping plane, leaving nothing to reflect. ➤ Build and run the app, capture the GPU workload and select the Water Render Pass.
Rendering above the clipping plane
All rendered scene model geometry is clipped, but the skybox still renders. ➤ Continue running the app, move the camera up and rotate it to look downwards.
Reflecting the sky correctly The sky now reflects correctly and your water reflection appears smooth and calm. As you move the camera up and down, the reflection is now consistent. Still water, no matter how calming, isn’t realistic. Time to give that water some ripples.
4. Rippling Normal Maps ➤ Open Textures.xcassets, and select normal-water. This is a normal map that you’ll use for the water ripples.
The water ripple normal map
You'll tile this map across the water and move it, perturbing the water normals, which will make the water appear to ripple.
➤ In the Geometry group, open Water.swift, and add these new properties to Water: var waterMovementTexture: MTLTexture? var timer: Float = 0
You add the texture, and a timer so that you can animate the normals. ➤ Add the following code to the end of init(): waterMovementTexture = try? TextureController.loadTexture(filename: "normal-water")
➤ In render(encoder:uniforms:params:), add the following code where you set the other fragment textures: encoder.setFragmentTexture( waterMovementTexture, index: 2) var timer = timer encoder.setFragmentBytes( &timer, length: MemoryLayout.size, index: 3)
Here, you send the texture and the timer to the fragment shader. ➤ Add this new method to Water: func update(deltaTime: Float) { let sensitivity: Float = 0.005 timer += deltaTime * sensitivity } deltaTime will be too fast, so you include a sensitivity modifier.
➤ Open GameScene.swift, and add this to the top of update(deltaTime:): water?.update(deltaTime: deltaTime) GameScene will update the water timer every frame.
You’ve now set up the texture and the timer on the CPU side. ➤ Open Water.metal, and add two new parameters for the texture and timer to fragment_water: texture2d normalTexture [[texture(2)]],
constant float& timer [[buffer(3)]]
➤ Add the following code before you define color: // 1 float2 uv = in.uv * 2.0; // 2 float waveStrength = 0.1; float2 rippleX = float2(uv.x + timer, uv.y); float2 rippleY = float2(-uv.x, uv.y) + timer; float2 ripple = ((normalTexture.sample(s, rippleX).rg * 2.0 - 1.0) + (normalTexture.sample(s, rippleY).rg * 2.0 - 1.0)) * waveStrength; reflectionCoords += ripple; // 3 reflectionCoords = clamp(reflectionCoords, 0.001, 0.999);
Going through the code: 1. Get the texture coordinates and multiply them by a tiling value. For 2 you get huge, ample ripples, while for 16 you get quite small ripples. Pick a value that suits your needs. 2. Calculate ripples by distorting the texture coordinates with the timer value. Only grab the R and G values from the sampled texture because they are the U and V coordinates that determine the horizontal plane where the ripples will be. The B value is not important here. waveStrength is an attenuator value, that gives you weaker or stronger waves. 3. Clamp the reflection coordinates to eliminate anomalies around the margins of the screen. ➤ Build and run the app, and you’ll see gorgeous ripples on the water surface.
Calming water ripples
5. Adding Refraction Implementing refraction is very similar to reflection, except that you only need to preserve the part of the scene below the clipping plane.
Rendering below the clipping plane ➤ Open WaterRenderPass.swift, and add this to the end of draw(commandBuffer:scene:uniforms:params:): // 1 descriptor.colorAttachments[0].texture = refractionTexture // 2 guard let refractEncoder = commandBuffer.makeRenderCommandEncoder( descriptor: descriptor) else { return } refractEncoder.label = "Refraction" // 3 uniforms.viewMatrix = scene.camera.viewMatrix clipPlane = float4(0, -1, 0, -water.position.y) uniforms.clipPlane = clipPlane // 4 render( renderEncoder: refractEncoder, scene: scene, uniforms: uniforms, params: params) refractEncoder.endEncoding()
Going through the code: 1. Set the refraction texture as the render target. 2. Create a new render command encoder. 3. Set the y value of the clip plane to -1 since the camera is now in its original position and pointing down toward the water. 4. Render all the elements of the scene again.
➤ Build and run the app, capture the GPU workload and check out the refraction texture.
The refraction texture This time, the GPU has only rendered the scene geometry below the clipping plane. You’re already sending the refraction texture to the GPU, so you can work on the calculations straightaway. ➤ Open Water.metal, and, in fragment_water, add this below where you define reflectionCoords: float2 refractionCoords = float2(x, y);
For refraction you don’t have to flip the y coordinate. ➤ Similarly, add this below reflectionCoords += ripple;: refractionCoords += ripple;
➤ And once more, add this after the reflection line preventing edge anomalies: refractionCoords = clamp(refractionCoords, 0.001, 0.999);
➤ Finally, replace: float4 color = reflectionTexture.sample(s, reflectionCoords);
➤ With: float4 color = refractionTexture.sample(s, refractionCoords);
You temporarily show only the refraction texture. You’ll return and include reflection shortly. ➤ Build and run the app.
Refraction

The reflection on the water surface is gone, and instead, you have refraction through the water. There’s one more visual enhancement you can make to your water to make it more realistic: adding rocks and grime. Fortunately, the project already has a texture that can simulate this.

➤ Open Terrain.metal, and in fragment_terrain, uncomment the section under // uncomment this for pebbles.

➤ Build and run the app, and you’ll now see a pebbled texture underwater.
Pebbles

The holy grail of realistic water, however, is having a Fresnel effect that harmoniously combines reflection and refraction based on the viewing angle.
6. The Fresnel Effect

The Fresnel effect is a concept you’ve met in previous chapters. As you may remember, the viewing angle plays a significant role in the amount of reflection you can see. What’s new in this chapter is that the viewing angle also affects refraction, but in inverse proportion:

• The steeper the viewing angle is, the weaker the reflection and the stronger the refraction.
• The shallower the viewing angle is, the stronger the reflection and the weaker the refraction.

The Fresnel effect in action:
➤ Open Water.metal, and in fragment_water, before you define color, add this:

float3 viewVector =
  normalize(params.cameraPosition - in.worldPosition.xyz);
float mixRatio = dot(viewVector, float3(0, 1, 0));
return mixRatio;
Here, you work out the view vector between the camera and the water fragment. The mix ratio will be the blend between reflection and refraction.
➤ Build and run the app.
The mix ratio between refraction and reflection

As you move about the scene, the greater the angle between the camera and the water, the whiter the water becomes. A view across the water, down close to the water, returns black.

Instead of rendering black and white, you’ll mix between the refraction and reflection textures. Where the mix ratio is black, you’ll render the reflection texture, and where it’s white, refraction. A ratio of 0.5 would mean that reflection and refraction are mixed equally.

➤ Replace:

return mixRatio;
float4 color = refractionTexture.sample(s, refractionCoords);
➤ With:

float4 color = mix(reflectionTexture.sample(s, reflectionCoords),
                   refractionTexture.sample(s, refractionCoords),
                   mixRatio);
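For reference, mix performs a plain linear blend, so a mixRatio of 0 returns the pure reflection sample and 1 returns pure refraction:

mix(x, y, a) = x * (1 - a) + y * a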
➤ Build and run the app.
Move the camera around and notice how reflection predominates for a small viewing angle while refraction predominates when the viewing angle is getting closer to 90 degrees (perpendicular to the water surface).
7. Adding Smoothness Using a Depth Texture

Light propagation varies for different transparent media, but for water, the colors with longer wavelengths (closer to infrared) quickly fade away as the light ray goes deeper. The bluish colors (closer to ultraviolet) tend to be visible at greater depths because they have shorter wavelengths. At very shallow depths, however, most light should still be visible. You’ll make the water look smoother as depth gets smaller. You can improve the way the water surface blends with the terrain by using a depth map.

➤ Open Water.swift, and add the following code to render(encoder:uniforms:params:) when you set the other fragment textures:

encoder.setFragmentTexture(
  refractionDepthTexture,
  index: 3)
As well as sending the refraction texture from the refraction render pass, you’re now sending the depth texture too.

➤ Open Pipelines.swift, and add this to createWaterPSO(vertexDescriptor:) before the return:

let attachment = pipelineDescriptor.colorAttachments[0]
attachment?.isBlendingEnabled = true
attachment?.rgbBlendOperation = .add
attachment?.sourceRGBBlendFactor = .sourceAlpha
attachment?.destinationRGBBlendFactor = .oneMinusSourceAlpha
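With .add as the blend operation and those two factors, the blend stage combines the water fragment with whatever is already in the render target using standard source-over compositing:

finalColor.rgb = source.rgb * source.a + destination.rgb * (1 - source.a)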
Here, you configure the blending options on the color attachment just as you did back in Chapter 20, “Fragment Post-Processing.”

➤ Open Water.metal, and add the depth texture parameter to fragment_water:

depth2d<float> depthMap [[texture(3)]]
➤ Add the following code before you set rippleX and rippleY:

float far = 100; // the camera's far plane
float near = 0.1; // the camera's near plane
float proj33 = far / (far - near);
float proj43 = proj33 * -near;
float depth = depthMap.sample(s, refractionCoords);
float floorDistance = proj43 / (depth - proj33);
depth = in.position.z;
float waterDistance = proj43 / (depth - proj33);
depth = floorDistance - waterDistance;
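If you’re curious where those expressions come from: with the projection convention used here (a standard non-reversed depth buffer, which is an assumption since the projection matrix isn’t shown in this chapter), the stored depth relates to the view-space distance z as d = proj33 + proj43 / z. Solving for z gives:

z = proj43 / (d - proj33)

which is exactly what the code evaluates for both the terrain sample and the water fragment.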
You convert the non-linear depth to a linear value.

Note: Why and how you convert from non-linear to linear is mathematically complex. The gamedev.net forums (https://bit.ly/3r086fK) have an explanation of converting a non-linear depth buffer value to a linear depth value.

➤ Finally, add this before return:

color.a = clamp(depth * 0.75, 0.0, 1.0);
Here, you change the alpha channel so that blending goes into effect. ➤ Build and run the app, and you’ll now see a smoother blending of the shore with the terrain.
Blending at the water's edge
Key Points

• Reflection and refraction are important for realistic water and glass.
• Rasterizing reflections and refraction will not produce as good a result as ray tracing. But when speed is a concern, ray tracing is often not viable.
• Use separate render passes to render textures. For reflection, move the camera in the inverse direction from the plane to be reflected and flip the result.
• You already know about near and far clipping planes, but you can also add your own custom clipping planes. A negative clip distance set in the vertex function will result in the GPU discarding the vertex.
• You can animate normal maps to provide water turbulence.
• The Fresnel effect depends upon viewing angle and affects reflection and refraction in inverse proportion.
Where to Go From Here?

You’ve certainly made a splash with this chapter! If you want to explore more about water rendering, the references.markdown file in the resources folder for this chapter contains links to interesting articles and videos.
Chapter 23: Animation
Rendering models that don’t move is a wonderful achievement, but animating models takes things to an entirely new level. To animate means to bring to life. So what better way to play with animation than to render characters with personality and body movement? In this chapter, you’ll find out how to do basic animation using keyframes.
The Starter Project

➤ In Xcode, open the starter project for this chapter, and build and run the app.
The scene contains a ground plane and a ball. Because there’s no skybox, the renderer will use the forward renderer with PBR shading. In the Animation group, BallAnimations.swift contains a few pre-built animations. At the moment, the ball animation is a bit unnatural looking — it’s just sitting there embedded into the ground. To liven things up, you’ll start off by making it roll around the scene.
Animation

Animators like Winsor McCay and Walt Disney brought life to still images by filming a series of hand-drawn pictures one frame at a time.
Winsor McCay: Gertie the Dinosaur
This frame-by-frame animation was — and still is — very time consuming. With the rise of computer animation, artists can now create 3D models and record their positions at specific points in time. From there, the computer could interpolate, or tween, the values between those positions, making the animation process a lot less time consuming. But there is another option: procedural animation.
Procedural Animation

Procedural animation uses mathematics to calculate transformations over time. In this chapter, you’ll first animate the ball using the sine function, just as you did earlier in Chapter 7, “The Fragment Function”, when you animated a quad with trigonometric functions.

The starter project contains a scene with a ball. To begin, you’ll create a structure that controls the ball’s animation.

➤ In the Game group, add a new Swift file named Beachball.swift, and add this:

struct Beachball {
  var ball: Model
  var currentTime: Float = 0

  init(model: Model) {
    self.ball = model
    ball.position.y = 1
  }

  mutating func update(deltaTime: Float) {
    currentTime += deltaTime
  }
}
Here, you initialize Beachball with the model reference, and create a method that GameScene will call every frame. (You’ll use the timer to animate your model over time.) ➤ Open GameScene.swift, and add a new property: lazy var beachball = Beachball(model: ball)
➤ Then, add the following code to the top of update(deltaTime:): beachball.update(deltaTime: deltaTime)
All of the ball’s movement and animation will now take place in Beachball.
➤ Open Beachball.swift, and add the following code to the end of update(deltaTime:): ball.position.x = sin(currentTime)
This code updates the ball’s x position every frame by the sine of the accumulated current time. ➤ Build and run the app.
Side to side sine animation

The ball now moves from side to side. Sine is useful for procedural animation. By changing the amplitude, period and frequency, you can create waves of motion — although, for a ball, that’s not very realistic. However, with some physics, you can add a little bounce to its movement.
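As an aside, here’s a hypothetical variation on that line (the amplitude and frequency values are made up) showing how the two parameters shape the motion:

let amplitude: Float = 2
let frequency: Float = 3
ball.position.x = amplitude * sin(currentTime * frequency)

A larger amplitude swings the ball farther from the center, while a higher frequency makes it oscillate faster.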
Animation Using Physics

Instead of creating animation by hand using an animation app, you can use physics-based animation, which means that your models can simulate the real world. In this next exercise, you’re going to simulate only gravity and a collision. However, a full physics engine can simulate all sorts of effects, such as fluid dynamics, cloth and soft body (rag doll) dynamics.

➤ Create a new property in Beachball to track the ball’s velocity:

var ballVelocity: Float = 0
➤ Remove the following code from update(deltaTime:): ball.position.x = sin(currentTime)
➤ In update(deltaTime:), set some constants for the individual physics that you’ll need for the simulation:

let gravity: Float = 9.8 // meter / sec2
let mass: Float = 0.05
let acceleration = gravity / mass
let airFriction: Float = 0.2
let bounciness: Float = 0.9
let timeStep: Float = 1 / 600
gravity represents the acceleration of an object falling to Earth. If you’re simulating gravity elsewhere in the universe, for example, Mars, this value would be different. Newton’s Second Law of Motion is F = ma, or force = mass * acceleration. Rearranging the equation gives acceleration = force (gravity) / mass. The other constants describe the surroundings and properties of the ball. If this were a bowling ball, it would have a higher mass and less bounce.

➤ Add this at the end of update(deltaTime:):

ballVelocity += (acceleration * timeStep) / airFriction
ball.position.y -= ballVelocity * timeStep
// collision with ground: reverse and dampen the velocity
// once the ball reaches its resting height
if ball.position.y <= ball.size.y / 2 {
  ball.position.y = ball.size.y / 2
  ballVelocity = -ballVelocity * bounciness
}

Later, in Animation.swift, the Animation structure's getTranslation(at:) method begins like this:

func getTranslation(at time: Float) -> float3? {
  // 1
  guard let lastKeyframe = translations.last else {
    return nil
  }
  var currentTime = time
  // 2
  if let first = translations.first,
    first.time >= currentTime {
    return first.value
  }
  // 3
  if currentTime >= lastKeyframe.time,
    !repeatAnimation {
    return lastKeyframe.value
  }
}
This method returns the interpolated keyframe. Here’s the breakdown:

1. Ensure that there are translation keys in the array, otherwise, return a nil value.
2. If the first keyframe occurs on or after the time given, then return the first key value. The first frame of an animation clip should be at keyframe 0 to give a starting pose.
3. If the time given is greater than the last key time in the array, then check whether you should repeat the animation. If not, then return the last value.
➤ Add the following code to the bottom of getTranslation(at:):

// 1
currentTime = fmod(currentTime, lastKeyframe.time)
// 2
let keyFramePairs = translations.indices.dropFirst().map {
  (previous: translations[$0 - 1], next: translations[$0])
}
// 3
guard let (previousKey, nextKey) = (keyFramePairs.first {
  currentTime < $0.next.time
}) else { return nil }
// 4
let interpolant = (currentTime - previousKey.time) /
  (nextKey.time - previousKey.time)
// 5
return simd_mix(
  previousKey.value,
  nextKey.value,
  float3(repeating: interpolant))
Going through this code:

1. Use the modulo operation to get the current time within the clip.
2. Create a new array of tuples containing the previous and next keys for all keyframes, except the first one.
3. Find the first tuple of previous and next keyframes where the current time is less than the next keyframe time. The current time will, therefore, be between the previous and next keyframe times.
4. Use the interpolation formula to get a value between 0 and 1 for the progress percentage between the previous and next keyframe times.
5. Use simd_mix to interpolate between the two keyframes. (interpolant must be a value between 0 and 1.)

➤ Open BallAnimations.swift, and uncomment ballTranslations.

ballTranslations is an array of Keyframes with seven keys. The length of the clip is two seconds. You can see this by looking at the key time of the last keyframe. On the x-axis, the ball will start off at position -1 and then move to position 1 at 0.35 seconds. It will hold its position until 1 second has passed, then return to -1 at 1.35 seconds. It will then hold its position until the end of the clip.
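Purely as an illustration (the real values live in BallAnimations.swift, and the Keyframe initializer shown here is assumed), a clip like that could be declared along these lines:

// Hypothetical sketch of the two-second translation clip described above.
let exampleTranslations: [Keyframe<float3>] = [
  Keyframe(time: 0, value: [-1, 0, 0]),
  Keyframe(time: 0.35, value: [1, 0, 0]),
  Keyframe(time: 1, value: [1, 0, 0]),
  Keyframe(time: 1.35, value: [-1, 0, 0]),
  Keyframe(time: 2, value: [-1, 0, 0])
]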
By changing the values in the array, you can speed up the throw and hold the ball for longer at either end.

➤ Open Beachball.swift, and replace update(deltaTime:) with:

mutating func update(deltaTime: Float) {
  currentTime += deltaTime
  var animation = Animation()
  animation.translations = ballTranslations
  ball.position =
    animation.getTranslation(at: currentTime) ?? [0, 0, 0]
  ball.position.y += ball.size.y
}
Here, you load the animation clip with the generated keyframe translations. Generally, you’ll want to load the animation clip outside of the update, but for the sake of simplicity, in this example, handling things within update(deltaTime:) is fine. You then extract the ball’s position from the animation clip for the current time. ➤ Build and run the app, and watch as creepy invisible hands toss your ball around.
Tossing the ball

Note: Notice the trajectory of the ball on the y-axis. It currently goes up and down in diagonal straight lines. Better keyframing can fix this.
Euler Angle Rotations

Now that you have the ball translating through the air, you probably want to rotate it as well. To express rotation of an object, you currently hold a float3 with rotation angles on the x, y and z axes. These are known as Euler angles after the mathematician Leonhard Euler. Euler is the man behind Euler’s rotation theorem — a theorem which states that any rotation can be described using three rotation angles.

This is OK for a single rotation, but interpolating between these three values doesn’t work the way you might think.

➤ To create a rotation matrix, you’ve been calling this function, hidden in the math library in Utility/MathLibrary.swift:

init(rotation angle: float3) {
  let rotationX = float4x4(rotationX: angle.x)
  let rotationY = float4x4(rotationY: angle.y)
  let rotationZ = float4x4(rotationZ: angle.z)
  self = rotationX * rotationY * rotationZ
}
Here, the final rotation matrix is made up of three rotation matrices multiplied in a particular order. This order is not set in stone and is one of six possible orders. Depending on the multiplication order, you’ll get a different rotation.

Note: Sometimes, these rotations are referred to as Yaw-Pitch-Roll. You’ll see these names a lot in flight simulators. Depending on your frame of reference, if you’re using the y-axis as up and down (remember that’s not universal), then Yawing is about the y-axis, Pitching is about the x-axis and Rolling is about the z-axis.

For static objects within one rendering engine, this is fine. The main problem comes with animation and interpolating these angles.
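To see that the order isn’t interchangeable, here’s a quick sketch using the project’s rotation helpers from MathLibrary.swift (assumed to be in scope):

let rx = float4x4(rotationX: .pi / 2)
let ry = float4x4(rotationY: .pi / 2)
// Applying x first and then y...
let xThenY = ry * rx
// ...is not the same as applying y first and then x.
let yThenX = rx * ry
// The two matrices differ, so the final orientation depends on the order.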
As you proceed through a rotation interpolation, if two axes become aligned, you get the terrifyingly named gimbal lock.
Gimbal lock means that you’ve lost one axis of rotation. Because the inner axis rotations build on the outer axis rotation, the two rotations overlap and cause odd interpolation.
Quaternions

Multiplying x, y and z rotations without imposing a sequence on them is impossible unless you involve the fourth dimension. In 1843, Sir William Rowan Hamilton did just that: he inscribed his fundamental formula for quaternion multiplication onto a stone on a bridge in Dublin.
The formula uses four-dimensional vectors and complex numbers to describe rotations. The mathematics is complicated, but fortunately, you don’t have to understand how quaternions work to use them. The main benefits of quaternions are:

• They interpolate correctly when using spherical linear interpolation (or slerp).
• They never lose any axes of control.
• They always take the shortest path between two rotations unless you specifically ask for the longest path.

Note: If you’re interested in studying the internals of quaternions, references.markdown contains further reading.
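As a quick taste of the simd API you’ll use below (a standalone sketch, not project code), spherical interpolation between two rotations looks like this:

import simd

let start = simd_quatf(angle: 0, axis: [0, 0, 1])
let end = simd_quatf(angle: .pi / 2, axis: [0, 0, 1])
// Halfway between the two rotations: 45º about the z-axis.
let halfway = simd_slerp(start, end, 0.5)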
You don’t have to write any complicated interpolation code, as simd has quaternion classes and methods that handle everything for you using simd_slerp(). The quaternions perform a spherical interpolation along the shortest path as shown in the following image.
Spherical interpolation

Internally in simd, quaternions are vectors of four elements, but Apple suggests that you treat them as abstract mathematical objects rather than delving into internal storage. That lets you off the hook for learning that the last element of the quaternion is the real part, and the first three elements are the imaginary part.

You’ll switch from using Euler rotations to using quaternions for your rotations. Taking advantage of simd conversion of quaternions to and from rotation matrices, this switch is almost effortless.

➤ In the Geometry group, open Transform.swift, and add this property to Transform:

var quaternion = simd_quatf()
➤ In the extension where you define modelMatrix, change the definition of rotation to: let rotation = float4x4(quaternion)
Your Transforms will now support quaternions instead of Euler angles. You should also change rotation so that it updates the quaternion.
➤ In Transform, change the definition of rotation to:

var rotation: float3 = [0, 0, 0] {
  didSet {
    let rotationMatrix = float4x4(rotation: rotation)
    quaternion = simd_quatf(rotationMatrix)
  }
}
The quaternion value will now stay in sync when you set a model’s rotation.

➤ In the Transformable extension, add this:

var quaternion: simd_quatf {
  get { transform.quaternion }
  set { transform.quaternion = newValue }
}
With this syntactic sugar, instead of referring to model.transform.quaternion, you can now shorten it to model.quaternion.

To animate using quaternion rotation, you’ll duplicate what you did for translations.

➤ Open Animation.swift, and add a new property to Animation:

var rotations: [Keyframe<simd_quatf>] = []
This time the keyframes will be quaternion values.

➤ Duplicate getTranslation(at:) to a new method called getRotation(at:) that uses rotations and quaternions instead of translations and floats:

func getRotation(at time: Float) -> simd_quatf? {
  guard let lastKeyframe = rotations.last else {
    return nil
  }
  var currentTime = time
  if let first = rotations.first,
    first.time >= currentTime {
    return first.value
  }
  if currentTime >= lastKeyframe.time,
    !repeatAnimation {
    return lastKeyframe.value
  }
  currentTime = fmod(currentTime, lastKeyframe.time)
  let keyFramePairs = rotations.indices.dropFirst().map {
    (previous: rotations[$0 - 1], next: rotations[$0])
  }
  guard let (previousKey, nextKey) = (keyFramePairs.first {
    currentTime < $0.next.time
  }) else { return nil }
  let interpolant = (currentTime - previousKey.time) /
    (nextKey.time - previousKey.time)
  return simd_slerp(
    previousKey.value,
    nextKey.value,
    interpolant)
}
Note that you change the interpolation function to use simd_slerp instead of simd_mix. This does the necessary spherical interpolation.

➤ Open BallAnimations.swift, and uncomment ballRotations.

ballRotations is an array of rotation keyframes. The rotation starts out at 0, then rotates by 90º on the z-axis over several keyframes, returning to a rotation of 0 at 0.35 seconds. The reason for rotating in several 90º steps is that if you keyframe directly from 0º to 360º, the shortest path between those two orientations is 0º, so the ball won’t rotate at all.
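For example, purely as a sketch (with the Keyframe initializer assumed, as before), a half-second full spin could be keyed in quarter turns so every slerp segment has an unambiguous shortest path:

let spin: [Keyframe<simd_quatf>] = (0...4).map {
  Keyframe(
    time: Float($0) * 0.125,
    value: simd_quatf(angle: Float($0) * .pi / 2, axis: [0, 0, 1]))
}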
➤ Open Beachball.swift, and replace update(deltaTime:) with this:

mutating func update(deltaTime: Float) {
  currentTime += deltaTime
  var animation = Animation()
  animation.translations = ballTranslations
  animation.rotations = ballRotations
  ball.position =
    animation.getTranslation(at: currentTime) ?? float3(repeating: 0)
  ball.position.y += ball.size.y / 2
  ball.quaternion =
    animation.getRotation(at: currentTime) ?? simd_quatf()
}
You load the animation clip with both translation and rotation values. You then extract the values for position and quaternion for the given time.
➤ Build and run the app, and your ball moves back and forth with rotation.
The ball rotates as it moves

If you need more complex animations, you’ll probably want to create the animation in a 3D app. The ball actually holds some transformation animation (made in Blender) in its USD file.
USD and USDZ Files

One major problem to overcome is how to import animation from 3D apps. Model I/O can import .obj files, but they only hold static information, not animation. USD is a format devised by Pixar, which can hold massive scenes with textures, animation and lighting information. There are various file extensions:

• .usd: A Universal Scene Description (USD) file consists of assets or links to assets which allows multiple artists to work on the same scene. The file can contain mesh geometry, shading information, models, cameras and lighting.
• .usdz: A single archive file that contains all the files - not just links - necessary for rendering a model.
• .usda: This file is the USD file in text format. The ball included in this chapter’s project is in .usda format so that you can open the file with TextEdit and inspect the contents.
• .usdc: This file is the USD file in binary format.

Apple has adopted USDZ — the archive derivation of the USD format — as their preferred augmented reality 3D format. Maya 2022, Houdini 19.0 and Blender 3.0 can import and export USD formats.
Note: As of the time of writing, Blender 3.0 can’t export skeletal animation directly to USD. You can animate meshes, such as this ball, in Blender 3.0 and export the animation because the mesh is not attached to a skeleton. You’ll read more about skeletal animation in the following chapter.

Apple also provides Reality Converter (https://apple.co/3H8FKWd), which allows you to convert from other formats, and view and customize your USDZ files. Currently, the supported formats are .obj, .fbx, .abc and .glTF.

Sketchfab (http://sketchfab.com) is a major provider and showcase of 3D models. All of their downloadable models are available converted to the USDZ format.
Animating Meshes

The file beachball.usda holds translation and rotation animation, and Model I/O can extract this animation. There are several ways to approach initializing this information, and you’ll use the first in this chapter.

Model I/O transform components don’t allow you to access the rotation and translation values directly, but they provide you with a method that returns a transform matrix at a particular time. So for mesh transform animation, you’ll extract the animation data for every frame of the animation during the model loading process. In the next chapter, when you work on skeletal animation, you’ll have access to joint rotation and translation, so you’ll load data only where there are keyframes, and use your interpolation methods to interpolate each frame.

Note: When writing your own engine, you’ll have the choice to load this animation data up front for every frame, to match the transformation animation. You should consider the requirements of your game and what information your models hold. Generally, it is more efficient to extract the loading code to a separate app which loads models and saves materials, textures and animation data into a more efficient format that matches your game engine. A good example of this asset pipeline is Apple’s video and sample code From Art to Engine with Model I/O from WWDC 2017 (https://developer.apple.com/videos/play/wwdc2017/610/).
You’ll be running your game at a fixed fps - generally 60, and you’ll hold a transform matrix for every frame of animation. ➤ In the Game group, open GameController.swift, and add a new property: static var fps: Double = 0
➤ At the top of init(metalView:options:), add this: Self.fps = Double(metalView.preferredFramesPerSecond)
You’ll use fps as the standard frames per second for your app. You set fps right at the top of init(metalView:options:) because the models will use it when you initialize GameScene.

Model I/O can hold transform information on all objects within the MDLAsset. For simplicity, you’ll hold a transform component on each Mesh, and just animate the transforms for the duration given by the asset.

➤ In the Animation group, create a new file named TransformComponent.swift, and replace the default code with:

import ModelIO

struct TransformComponent {
  let keyTransforms: [float4x4]
  let duration: Float
  var currentTransform: float4x4 = .identity
}
You’ll hold all the transform matrices for each frame for the duration of the animation. For example, if the animation has a duration of 2.5 seconds at 60 frames per second, keyTransforms will have 150 elements. Later, you’ll update each Mesh’s currentTransform every frame with the transform for the current frame taken from keyTransforms.

➤ Now, add the following to TransformComponent:

init(
  transform: MDLTransformComponent,
  object: MDLObject,
  startTime: TimeInterval,
  endTime: TimeInterval
) {
  duration = Float(endTime - startTime)
  let timeStride = stride(
    from: startTime,
    to: endTime,
    by: 1 / TimeInterval(GameController.fps))
  keyTransforms = Array(timeStride).map { time in
    MDLTransform.globalTransform(
      with: object,
      atTime: time)
  }
}
This initializer receives an MDLTransformComponent from either an asset or a mesh and then creates all the transform matrices for every frame for the duration of the animation.

➤ Add the following to TransformComponent:

mutating func getCurrentTransform(at time: Float) {
  guard duration > 0 else {
    currentTransform = .identity
    return
  }
  let frame = Int(fmod(time, duration) * Float(GameController.fps))
  if frame < keyTransforms.count {
    currentTransform = keyTransforms[frame]
  } else {
    currentTransform = keyTransforms.last ?? .identity
  }
}
You retrieve a transform matrix at a particular time. You calculate the current frame of the animation from the time. Using the floating-point modulo function fmod, you can loop the animation. For example, if the animation is 2.5 seconds long, at 60 frames per second, that would mean there are 150 frames in the animation. If the current time is 5 seconds, the clip has already played through twice, so fmod wraps the time back to 0, and playback starts over at frame 0. You save the current transform on the transform component. You’ll use the transform to update the position of the mesh vertices shortly.

The animation will need the start and end time from the asset.

➤ Open Mesh.swift, and add a new initializer:

init(
  mdlMesh: MDLMesh,
  mtkMesh: MTKMesh,
  startTime: TimeInterval,
  endTime: TimeInterval
) {
  self.init(mdlMesh: mdlMesh, mtkMesh: mtkMesh)
}
➤ Open Model.swift, and in init(name:), locate Mesh(mdlMesh: $0.0, mtkMesh: $0.1) inside the meshes assignment.

➤ Change Mesh(mdlMesh: $0.0, mtkMesh: $0.1) to:

Mesh(
  mdlMesh: $0.0,
  mtkMesh: $0.1,
  startTime: asset.startTime,
  endTime: asset.endTime)
You use your new initializer in place of the old one. ➤ Back in Mesh.swift, add a new property to Mesh: var transform: TransformComponent?
➤ Add the following code to the end of init(mdlMesh:mtkMesh:startTime:endTime:):

if let mdlMeshTransform = mdlMesh.transform {
  transform = TransformComponent(
    transform: mdlMeshTransform,
    object: mdlMesh,
    startTime: startTime,
    endTime: endTime)
} else {
  transform = nil
}
Now that you’ve set up the transform component with animation, you’ll be able to use it when rendering each frame. ➤ Open Model.swift, and add a new property to keep track of elapsed game time: var currentTime: Float = 0
➤ Change let meshes: [Mesh] to: var meshes: [Mesh]
You’ll update the meshes every frame, so you need write access.
➤ Add a new method to Model:

func update(deltaTime: Float) {
  currentTime += deltaTime
  for i in 0..<meshes.count {
    meshes[i].transform?.getCurrentTransform(at: currentTime)
  }
}

In the Chapter 28 playground's Shaders.metal, the signed distance function for a rectangle is:

float distanceToRectangle(float2 point, Rectangle rectangle) {
  // 1
  float2 distances =
    abs(point - rectangle.center) - rectangle.size / 2;
  return
    // 2
    all(sign(distances) > 0) ? length(distances)
    // 3
    : max(distances.x, distances.y);
}
Going through the code:

1. Offset the current point coordinates by the given rectangle center. Then, get the symmetrical coordinates of the given point by using the abs() function, and calculate the signed distance to each of the two edges.
2. If those two distances are positive, then you’ll need to calculate the distance to the corner.
3. Otherwise, return the distance to the closer edge.

Note: In this case, rectangle.size / 2 is the distance from the rectangle center to an edge, similar to what a radius is for a circle.
Next is a handy function that lets you subtract one shape from another. Think about Set Theory from back in your school days.
Shape subtraction

Note: You can find out more about Set Theory here: https://en.wikipedia.org/wiki/Complement_(set_theory)#Relative_complement

➤ Add this function to Shaders.metal before compute:

float differenceOperator(float d0, float d1) {
  return max(d0, -d1);
}
This code yields a value that can be used to calculate the difference result from the previous image, where the second shape is subtracted from the first. The result of this function is a signed distance to a compound shape boundary. It’ll only be negative when inside the first shape, but outside the second.

➤ Continue by adding this code to design a basic scene:

float distanceToScene(float2 point) {
  // 1
  Rectangle r1 = Rectangle{float2(0.0), float2(0.3)};
  float d2r1 = distanceToRectangle(point, r1);
  // 2
  Rectangle r2 = Rectangle{float2(0.05), float2(0.04)};
  float2 mod = point - 0.1 * floor(point / 0.1);
  float d2r2 = distanceToRectangle(mod, r2);
  // 3
  float diff = differenceOperator(d2r1, d2r2);
  return diff;
}
Going through the code:

1. Create a rectangle, and get the distance to it.
2. Create a second, smaller rectangle, and get the distance to it. The difference here is that the area is repeated every 0.1 points — which is a 10th of the size of the scene — using a modulo operation. See the note below.
3. Subtract the second repeated rectangle from the first rectangle, and return the resulting distance.

Note: The fmod function in MSL uses trunc instead of floor, so you create a custom mod operator because you also want to use the negative values. You use the GLSL specification for mod, which is x - y * floor(x/y). You need the modulus operator to draw many small rectangles mirrored with a distance of 0.1 from each other.

Finally, use these functions to generate a shape that looks a bit like a fence or a trellis.

➤ At the end of compute, replace the color assignment with:

float d2scene = distanceToScene(uv);
bool inside = d2scene < 0.0;
float4 color = inside
  ? float4(0.8, 0.5, 0.5, 1.0)
  : float4(0.9, 0.9, 0.8, 1.0);
➤ Run the playground:
The initial scene
For shadows to work, you need to:

1. Get the distance to the light.
2. Know the light direction.
3. Step in that direction until you either reach the light or hit an object.

➤ Open Shaders.metal, and above the last line in compute, add this:

float2 lightPos = 2.8 * float2(sin(time), cos(time));
float dist2light = length(lightPos - uv);
color *= max(0.3, 2.0 - dist2light);
First, you create a light at position lightPos, which you’ll animate just for fun using the timer uniform that you passed from the host code. Then, you get the distance from any given point to lightPos, and you color the pixel based on the distance from the light — but only if it’s not inside an object. You make the color lighter when closer to the light, and darker when further away with the max() function to avoid negative values for the brightness of the light. ➤ Run the playground.
A moving light

Notice the moving light that appears at the corners as it circuits the scene.
You just took care of the first two steps: light position and direction. Now it’s time to handle the third one: the shadow function.

➤ In Shaders.metal, add this above compute:

float getShadow(float2 point, float2 lightPos) {
  // 1
  float2 lightDir = lightPos - point;
  // 2
  for (float lerp = 0; lerp < 1; lerp += 1 / 300.0) {
    // 3
    float2 currentPoint = point + lightDir * lerp;
    // 4
    float d2scene = distanceToScene(currentPoint);
    if (d2scene <= 0.0) { return 0.0; }
  }
  return 1.0;
}
The shadow function is quite similar to that of hard shadows with a few modifications. You normalize the direction of the light, and then you keep updating the distance along the ray as you march along with it. You also reduce the number of steps to only 100. ➤ Replace the last line in compute with this: float s = shadow(ray, light); output.write(float4(col * l * s, 1.0), gid);
➤ Run the playground, and you’ll see the light casting shadows.
Light casting shadows

Time to finally get some soft shadows in the scene. In real life, a shadow spreads out the farther it gets from an object. For example, where an object touches the floor, you get a sharp shadow; but farther away from the object, the shadow is more blurred. In other words, you start at some point on the floor, march toward the light, and have either a hit or a miss.
Hard shadows are straightforward: you hit something, it’s in the shadow. Soft shadows have in-between stages.

➤ In Shaders.metal, replace the shadow function with this:

// 1
float shadow(Ray ray, float k, Light l) {
  float3 lightDir = l.position - ray.origin;
  float lightDist = length(lightDir);
  lightDir = normalize(lightDir);
  // 2
  float light = 1.0;
  float eps = 0.1;
  // 3
  float distAlongRay = eps * 2.0;
  for (int i = 0; i < 100; i++) {
    // march a light ray from the surface point toward the light
    Ray lightRay = Ray{ray.origin + lightDir * distAlongRay, lightDir};
    float dist = distToScene(lightRay);
    // 4
    light = min(light, 1.0 - (eps - dist) / eps);
    // 5
    distAlongRay += dist;
    eps += dist * k;
    // 6
    if (distAlongRay > lightDist) { break; }
  }
  return max(light, 0.0);
}
Going through the code, here are the differences from the previous shadow function:

1. Add an attenuator k as a function argument, which you’ll use to get intermediate values of light.
2. Start with a white light and a small value for eps. This is a variable that tells you how much wider the beam is as you go out into the scene. A thin beam means a sharp shadow while a wide beam means a soft shadow.
3. Start with a small distAlongRay, because otherwise, the surface at this point would shadow itself.
4. Compute the light by subtracting the distance from the beam width eps and then dividing by it. This gives you the percentage of beam covered. If you invert it (1 - beam width) you get the percentage of beam that’s in the light. Then, take the minimum of this new value and light to preserve the darkest shadow as you march along the ray.
5. Move along the ray, and increase the beam width in proportion to the distance traveled and scaled by the attenuator k.
6. If you’re past the light, break out of the loop. Avoid negative values by returning the maximum between 0.0 and the value of light.

Next, adapt the compute kernel code to work with the new shadow function.

➤ In compute, replace all of the lines after the one where you created the Ray object, with this:

// 1
bool hit = false;
for (int i = 0; i < 200; i++) {
  float dist = distToScene(ray);
  if (dist < 0.001) {
    hit = true;
    break;
  }
  ray.origin += ray.direction * dist;
}
// 2
col = float3(1.0);
// 3
if (!hit) {
  col = float3(0.8, 0.5, 0.5);
} else {
  float3 n = getNormal(ray);
  Light light = Light{float3(sin(time) * 10.0, 5.0, cos(time) * 10.0)};
  float l = lighting(ray, n, light);
  float s = shadow(ray, 0.3, light);
  col = col * l * s;
}
// 4
Light light2 = Light{float3(0.0, 5.0, -15.0)};
float3 lightRay = normalize(light2.position - ray.origin);
float fl = max(0.0, dot(getNormal(ray), lightRay) / 2.0);
col = col + fl;
output.write(float4(col, 1.0), gid);
Going through the code:

1. Add a Boolean that tells you whether or not you hit the object. If the distance to the scene is within 0.001, you have a hit.
2. Start with a default white color. This is important, because when you later multiply this color with the value of the shadow and that of the light, white will never influence the result because of multiplying by 1.
3. If there’s no hit, color everything in a nice sky color, otherwise determine the shadow value.
4. Add another fixed light source in front of the scene to see the shadows in greater detail.

➤ Run the playground, and you’ll see a beautiful combination of shadow tones.
Shadow Tones
Ambient Occlusion

Ambient occlusion (AO) is a global shading technique, unlike the Phong local shading technique you learned about in Chapter 10, “Lighting Fundamentals”. AO is used to calculate how exposed each point in a scene is to ambient lighting, which is determined by the neighboring geometry in the scene. AO is, however, a weak variant of global illumination. It looks like a scene on a rainy day and feels like a non-directional, diffuse shading effect.

For hollow objects, AO makes the interior look darker because the light is even more occluded inside. As you move towards the edges of the object, it looks lighter and lighter. Only large objects are taken into consideration when computing the amount of ambient light, such as the sky, walls or any other objects that would normally be big enough to cast a shadow if they were lit.

AO is usually a fragment post-processing technique. However, you are looking into it in this chapter because AO is a type of shadow.
In the following top-down image, you can see how the base of the curved wall is darker, as well as the base of the box.
➤ In the playground, open the Ambient Occlusion playground page.

➤ Run the playground, and you’ll see a scene with a ground plane and a light.
Ambient occlusion starter scene

➤ In the Resources folder, open Shaders.metal.
➤ Add a new box object type below the other types:

struct Box {
  float3 center;
  float size;
};
➤ Next, add a new distance function for Box before distToScene:

float distToBox(Ray r, Box b) {
  // 1
  float3 d = abs(r.origin - b.center) - float3(b.size);
  // 2
  return min(max(d.x, max(d.y, d.z)), 0.0) + length(max(d, 0.0));
}
Going through the code:

1. Offset the current ray origin by the center of the box. Then, get the symmetrical coordinates of the ray position by using the abs function. Offset the resulting distance d by the length of the box edge.
2. Get the distance to the farthest edge by using the max function, and then get the smaller value between 0 and the distance you just calculated. If the ray is inside the box, this value will be negative, so you need to add the larger length between 0 and d.

➤ Replace the return line in distToScene with this:

// 1
Sphere s1 = Sphere{float3(0.0, 0.5, 0.0), 8.0};
Sphere s2 = Sphere{float3(0.0, 0.5, 0.0), 6.0};
Sphere s3 = Sphere{float3(10., -5., -10.), 15.0};
float d2s1 = distToSphere(r, s1);
float d2s2 = distToSphere(r, s2);
float d2s3 = distToSphere(r, s3);
// 2
float dist = differenceOp(d2s1, d2s2);
dist = differenceOp(dist, d2s3);
// 3
Box b = Box{float3(1., 1., -4.), 1.};
float dtb = distToBox(r, b);
dist = unionOp(dist, dtb);
dist = unionOp(d2p, dist);
return dist;
Going through the code:

1. Draw two spheres with the same center: one with a radius of 8, and one with a radius of 6. Draw a third, larger sphere at a different location.
2. Subtract the second sphere from the first, resulting in a hollow, thicker sphere. Subtract the third sphere from the hollow sphere to make a cross-section through it.
3. Add a box and a plane to complete the scene.

➤ Run the Ambient Occlusion playground, and you’ll see the hollow sphere, box and plane.
The ambient occlusion scene

Time to work on the ambient occlusion code.

➤ In Shaders.metal, create a skeleton function above compute:

float ao(float3 pos, float3 n) {
  return n.y * 0.5 + 0.5;
}
This function uses the normal’s Y component for light and adds 0.5 to it. This makes it look like there’s light directly above. ➤ Inside compute, since there are no shadows anymore, replace this line: col = col * l * s;
➤ With this: float o = ao(ray.origin, n); col = col * o;
➤ At the end of compute, remove this: col = col + fl;
➤ Remove the unused variable definitions light, l, s, light2, lightRay and f1. ➤ Run the playground, and you’ll see the same scene as before — this time without the shadows, and the surfaces pointing upward are brighter.
This is a good start, but it’s not how ambient occlusion should look — at least not yet. Ambient means that the light does not come from a well-defined light source, but is rather lighting coming from other objects in the scene, all contributing to the general scene light. Occlusion means how much of the ambient light is blocked. The main idea about ambient occlusion is to use the point where the ray hits the surface and look at what’s around it. If there’s an object anywhere around it that will block most of the light nearby, that area will be dark. If there’s nothing around it, then the area is well lit. For in-between situations, you need more precision about how much light was occluded. raywenderlich.com
Cone tracing is a technique that uses a cone instead of a ray. If the cone intersects an object, you don’t just have a simple true/false result. You can find out how much of the cone the object covers at that point.

Tracing a cone might be a challenge though. You could make a cone using spheres aligned along a line, small at one end and big at the other end. This would be a good cone approximation to use. Since you’re doubling the sphere size at each step, that means you travel out from the surface very fast, so you need fewer iterations. That also gives you a nice wide cone.

➤ In Shaders.metal, replace the contents of the ao function with this:

// 1
float eps = 0.01;
// 2
pos += n * eps * 2.0;
// 3
float occlusion = 0.0;
for (float i = 1.0; i < 10.0; i++) {
  // 4
  float d = distToScene(Ray{pos, float3(0)});
  float coneWidth = 2.0 * eps;
  // 5
  float occlusionAmount = max(coneWidth - d, 0.);
  // 6
  float occlusionFactor = occlusionAmount / coneWidth;
  // 7
  occlusionFactor *= 1.0 - (i / 10.0);
  // 8
  occlusion = max(occlusion, occlusionFactor);
  // 9
  eps *= 2.0;
  pos += n * eps;
}
// 10
return max(0.0, 1.0 - occlusion);
Going through the code:

1. eps is both the cone radius and the distance from the surface.
2. Move away a bit to prevent hitting surfaces you’re moving away from.
3. occlusion is initially zero (the scene is white).
4. Get the scene distance, and double the cone radius so you know how much of the cone is occluded.
5. Eliminate negative values for the light by using the max function.
6. Get the amount, or ratio, of occlusion scaled by the cone width.
7. Set a lower impact for more distant occluders; the iteration count provides this.
8. Preserve the highest occlusion value so far.
9. Double eps, and then move along the normal by that distance.
10. Return a value that represents how much light reaches this point.

➤ Run the playground, and you’ll see ambient occlusion in all of its splendor.
Ambient occlusion

It would be useful to have a camera that moves around the scene. All it needs is a position, a ray that can be used as the camera’s direction and a divergence factor which shows how much the ray spreads.
➤ In Shaders.metal, add a new structure with the other types:

struct Camera {
  float3 position;
  Ray ray{float3(0), float3(0)};
  float rayDivergence;
};
Here, you’re setting up a camera using the look-at technique. This requires the camera to have a forward direction, an up direction and a left vector. If you’re using a right-handed coordinate system, it’s a right vector instead.

➤ Add this function before compute:

Camera setupCam(float3 pos, float3 target, float fov, float2 uv, int x) {
  // 1
  uv *= fov;
  // 2
  float3 cw = normalize(target - pos);
  // 3
  float3 cp = float3(0.0, 1.0, 0.0);
  // 4
  float3 cu = normalize(cross(cw, cp));
  // 5
  float3 cv = normalize(cross(cu, cw));
  // 6
  Ray ray = Ray{pos, normalize(uv.x * cu + uv.y * cv + 0.5 * cw)};
  // 7
  Camera cam = Camera{pos, ray, fov / float(x)};
  return cam;
}
Going through the code:

1. Multiply the uv coordinates by the field of view.
2. Calculate a unit direction vector cw for the camera’s forward direction.
3. The left vector will point orthogonally from an up and forward vector. cp is a temporary up vector.
4. The cross product gives you an orthogonal direction, so calculate the left vector cu using the forward and up vectors.
5. Calculate the correct up vector cv using the left and forward vectors.
6. Create a ray at the given origin with the direction determined by the left vector cu for the X-axis, by the up vector cv for the Y-axis and by the forward vector cw for the Z-axis.
7. Create a camera using the ray you created above. The third parameter is the ray divergence and represents the width of the cone. x is the number of pixels inside the field of view (e.g., if the view is 60 degrees wide and contains 60 pixels, each pixel is 1 degree). This is useful for speeding up the SDF when far away, and also for antialiasing.

➤ To initialize the camera, replace this line in compute:

Ray ray = Ray{float3(0., 4., -12), normalize(float3(uv, 1.))};
➤ With this:

float3 camPos = float3(sin(time) * 10., 3., cos(time) * 10.);
Camera cam = setupCam(camPos, float3(0), 1.25, uv, width);
Ray ray = cam.ray;
Run the playground, and as the camera circles the scene, you can view the ambient occlusion from all directions.
Camera circling the scene
Key Points

• Raymarching produces better quality shadows than rasterized shadows.
• Hard shadows are not realistic, as there are generally multiple light sources in the real world.
• Soft shadows give better transitions between areas that are in shadow and areas that aren’t.
• Ambient occlusion does not depend on scene lighting, but on neighboring geometry. The closer geometry is to an area, the darker the area is.
Where to Go From Here?

In addition to the shadow types you learned in this chapter, there are other shadow techniques such as Screen Space Ambient Occlusion and Shadow Volumes. If you’re interested in learning about these, review references.markdown in the resources folder for this chapter.
Chapter 29: Advanced Lighting
As you’ve progressed through this book, you’ve encountered various lighting and reflection models:

• In Chapter 10, “Lighting Fundamentals” you started with the Phong reflection model which defines light as a sum of three distinct components: ambient light, diffuse light and specular light.
• In Chapter 11, “Maps & Materials” you briefly looked at physically based rendering and the Fresnel effect.
• In Chapter 21, “Image-Based Lighting” you implemented skybox-based reflection and image-based lighting, and you used a Bidirectional Reflectance Distribution Function (BRDF) look-up table.

In this chapter, you’ll learn about global illumination and the famous rendering equation that defines it.
While reflection is possible using the local illumination techniques you’ve seen so far, advanced effects — like refraction, subsurface scattering, total internal reflection, caustics and color bleeding — are only possible with global illumination.
A real-life example of global illumination and caustics

You’ll start by examining the rendering equation. From there, you’ll move on to raymarched reflection and refraction.
The Rendering Equation

Two academic papers — one by James Kajiya, and the other by David Immel et al. — introduced the rendering equation in 1986. In its raw form, this equation might look intimidating:
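A common way to write it is:

$$L_o(x, \omega_o) = L_e(x, \omega_o) + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\,(n \cdot \omega_i)\, d\omega_i$$

Here, $L_o$ is the light leaving point $x$ in direction $\omega_o$, $L_e$ is the light the surface emits, $f_r$ is the BRDF, $L_i$ is the incoming light from direction $\omega_i$, and $n$ is the surface normal.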
The rendering equation

The rendering equation is based on the law of conservation of energy, and in simple terms, it translates to an equilibrium equation where the sum of all source lights must equal the sum of all destination lights:

incoming light + emitted light = transmitted light + outgoing light
If you rearrange the terms of the equilibrium equation, you get the most basic form of the rendering equation: outgoing light = emitted light + incoming light - transmitted light
The incoming light - transmitted light part of the equation is subject to recursion because of multiple light bounces at that point. That recursion process translates to an integral over a unit hemisphere that’s centered on the normal vector at the point and which contains all the possible values for the negative direction of the incoming light. Although the rendering equation might be a bit intimidating, think of it like this: All the light leaving an object is what remains from all the lights coming into the object after some of them were transmitted through the object. The transmitted light can be either absorbed by the surface of the object (material), changing its color; or scattered through the object, which leads to a range of interesting optical effects such as refraction, subsurface scattering, total internal reflection, caustics and so on.
Reflection

Reflection, like any other optical phenomenon, has an equation that depends on three things: the incoming light vector, the incident angle and the normal vector for the surface.
Reflection

The law of reflection states that the angle at which incident light hits the surface of an object is the same as the angle at which the reflected light leaves it, both measured from the surface normal. But enough with the theory for now. Time to have some fun coding!
Getting Started

➤ In Xcode, open the starter playground named AdvancedLighting, and select the 1. Reflection playground page.

➤ Run the playground:
The starter playground

The code in this playground should look familiar to you because you’ve seen it in the two previous chapters. You’ll continue as before, writing code in the Metal shader for the relevant page. You’ll start by adding a checkerboard pattern to the plane, getting it to reflect onto the sphere.
Drawing a Checkerboard Pattern

To draw a pattern on the plane, you first need to have a way of identifying objects within the scene by comparing their proximity to the camera based on distance.

➤ Inside the Resources folder for this playground page, open Shaders.metal, and create two constants to identify the two objects in the scene:

constant float PlaneObj = 0.0;
constant float SphereObj = 1.0;
➤ In distToScene, after this line: float dts = distToSphere(r, s);
➤ Add this: float object = (dtp > dts) ? SphereObj : PlaneObj;
Here, you check whether the distance to the plane is greater than the distance to the sphere, and you hold the result in object. ➤ Replace return dist; with: return float2(dist, object);
You include both distance and object information in the function return. ➤ Run the playground to verify the image hasn’t changed. In Shaders.metal, the kernel function compute is where you’re raymarching the scene. In a for loop, you iterate over a considerable number of samples and update the ray color until you attain enough precision. It’s in this code block that you’ll draw the pattern on the plane. ➤ In compute, inside the for loop, locate: float2 dist = distToScene(cam.ray); distToScene returns the closest object in dist.y.
➤ Immediately after that line, add this: float closestObject = dist.y;
➤ After hit = true;, add this:

// 1
if (closestObject == PlaneObj) {
  // 2
  float2 pos = cam.ray.origin.xz;
  pos *= 0.1;
  // 3
  pos = floor(fmod(pos, 2.0));
  float check = mod(pos.x + pos.y, 2.0);
  // 4
  col *= check * 0.5 + 0.5;
}
Going through the code:

1. Build the checkerboard if the selected object is the plane.
2. Get the position of the camera ray in the horizontal XZ plane since you’re interested in intersecting the floor plane only.
3. Create squares. You first alternate between 0s and 1s on both X and Z axes by applying the modulo operator. At this point, you have a series of pairs containing either 0s or 1s or both. Next, add the two values together from each pair, and apply the modulo operator again. If the sum is 2, roll it back to 0; otherwise, it will be 1.
4. Apply color. Initially, it’s a solid white color. Multiply by 0.5 to tone it down, and add 0.5 back, so you can have both white and grey squares.

You have a compile error for the missing mod function. However, before adding the missing function, take a moment to understand why you need to implement a separate modulo operation.

The fmod function, as implemented by the Metal Shading Language, performs a truncated division where the remainder will have the same sign as the numerator:

fmod = numerator - denominator * trunc(numerator / denominator)
A second approach, missing from MSL, is known as floored division, where the remainder has the same sign as the denominator:

mod = numerator - denominator * floor(numerator / denominator)
These two approaches could have entirely different results. When calculating pos, the values need to alternate between 0s and 1s, so taking the floor of the truncated division is enough. However, when you add the two coordinates to determine the check value on the next line, you need the floored modulo of their sum.

➤ Add the new floored division function above compute:

float mod(float x, float y) {
  return x - y * floor(x / y);
}
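To make the difference concrete, here’s a quick Swift check of the two conventions for a negative input (the same arithmetic the formulas above describe):

import Foundation

let truncated = fmod(-1.5, 2.0)               // -1.5: sign follows the numerator
let floored = -1.5 - 2.0 * floor(-1.5 / 2.0)  //  0.5: sign follows the denominator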
➤ Run the playground, and you’ll see your checkerboard pattern.
The checkerboard pattern

All you need to do now is reflect the checkerboard onto the sphere.

➤ In Shaders.metal, add a new reflection function above compute:

Camera reflectRay(Camera cam, float3 n, float eps) {
  cam.ray.origin += n * eps;
  cam.ray.dir = reflect(cam.ray.dir, n);
  return cam;
}
The MSL standard library provides a reflect() function that takes the incoming ray direction and intersecting surface normal as arguments and returns the outgoing (reflected) ray direction. The reflectRay function is a convenience that returns the Camera object, not just its ray direction. ➤ In compute, after the if (closestObject == PlaneObj) block, but inside the if (dist.x < eps) block, add this: float3 normal = getNormal(cam.ray); cam = reflectRay(cam, normal, eps);
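For a normalized normal n and incident direction d, reflect(d, n) evaluates the standard reflection formula:

r = d - 2 * dot(d, n) * n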
This code gets the normal where the camera ray intersects an object, and reflects it at that point. You move the ray away from the surface, along the normal and not along the ray direction as you might have expected because that could be almost parallel to the surface. You only move away a small distance eps that’s precise enough to tell you when there’s not a hit anymore. raywenderlich.com
The bigger eps is, the fewer steps you need to hit the surface, so the faster your tracing is — but it’s also less accurate. You can play with various values for eps until you find a balance between precision and speed that satisfies your needs. ➤ Run the playground:
Reflecting the checkerboard

You're successfully reflecting the checkerboard onto the sphere, but the sky is not reflecting. This is because in the starter code you used the Boolean hit, which stops and breaks out of the loop when the ray first hits any object. That no longer works here, because now you need the ray to keep hitting objects for reflection.

➤ Open Shaders.metal, and in compute, replace this code:

if (!hit) {
  col = mix(float3(.8, .8, .4), float3(.4, .4, 1.), cam.ray.dir.y);
} else {
  float3 n = getNormal(cam.ray);
  float o = ao(cam.ray.origin, n);
  col = col * o;
}
➤ With: col *= mix(float3(0.8, 0.8, 0.4), float3(0.4, 0.4, 1.0), cam.ray.dir.y);
You add the sky color to the scene color globally, not just when a ray failed to hit an object in the scene. You can optionally remove the ao function and the two lines in compute where hit appears since you’re not using them anymore. ➤ Run the playground, and you’ll see the sky is now also reflected on the sphere and the floor.
Reflecting the sky You can spin the camera a little bit to make the reflection look more interesting. ➤ In Shaders.metal, add this parameter to compute: constant float &time [[buffer(0)]]
➤ And replace this line: float3 camPos = float3(15.0, 7.0, 0.0);
➤ With this: float3 camPos = float3(sin(time) * 15.0, sin(time) * 5.0 + 7.0, cos(time) * 15.0);
➤ Run the playground, and you’ll see the same image but now nicely animated.
Animated reflections
Refraction The law of refraction is a little more complicated than simple equality between the incoming and outgoing light vector angles.
Refraction
Refraction is dictated by Snell's law, which states that the ratio of the sines of the incident and refracted angles equals the inverse ratio of the two indices of refraction:
Snell's law The index of refraction (IOR) is a constant that defines how fast light propagates through various media. IOR is defined as the speed of light in a vacuum divided by the phase velocity of light in that particular medium. Note: There are published lists with IOR values for various media but the ones that interest us here are that of air (IOR = 1) and that of water (IOR = 1.33). See https://en.wikipedia.org/wiki/List_of_refractive_indices for more details. To find the angle for the refracted light vector through water, for example, all you need to know is the incoming light vector angle, which you can use from the reflected light vector. Then, you can divide that by the IOR for water since IOR for air is 1 and does not affect the calculation: sin(theta2) = sin(theta1) / 1.33
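To make that concrete, here's a quick numeric check in Swift, just a sketch of the arithmetic, with the angle measured from the surface normal:

import Foundation

let iorAir = 1.0
let iorWater = 1.33

// A ray hitting the water surface at 45° from the normal:
let theta1 = 45.0 * .pi / 180
let sinTheta2 = (iorAir / iorWater) * sin(theta1)
let theta2 = asin(sinTheta2) * 180 / .pi
print(theta2) // ≈ 32.1°: the ray bends toward the normal as it enters the denser medium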
Time for some more coding. ➤ Open the 2. Refraction playground page, and run it. The code is the same as the previous section, so you’ll see the same animation. ➤ Inside the Resources folder for this playground page, open Shaders.metal. You first need to have a way of knowing when the ray is inside the sphere, as you only do refraction in that case. ➤ In compute, add the following code before the for loop: bool inside = false;
In the first part of this chapter, you identified objects, so you know when the ray hits the sphere. This means that you can change the sign of the distance depending on whether the ray enters the sphere, or leaves it. As you know from previous chapters, a negative distance means you’re inside the object you are sending your ray towards. ➤ Locate: float2 dist = distToScene(cam.ray);
➤ And, add this line below it: dist.x *= inside ? -1.0 : 1.0;
This adjusts the x value to reflect whether you are inside the sphere or not. Next, you need to adjust the normals. ➤ Delete this line: float3 normal = getNormal(cam.ray);
➤ Then, locate this line: if (dist.x < eps) {
➤ After you find it, add the normal definition back into the code right below it: float3 normal = getNormal(cam.ray) * (inside ? -1.0 : 1.0);
You now have a normal that points outward when outside the sphere and inward when you’re inside the sphere. ➤ Move the following line so that it is inside the inner if block because you only want the plane to be reflective from now on: cam = reflectRay(cam, normal, eps);
➤ After the inner if block, add an else block where you make the sphere refractive:

// 1
else if (closestObject == SphereObj) {
  inside = !inside;
  // 2
  float ior = inside ? 1.0 / 1.33 : 1.33;
  cam = refractRay(cam, normal, eps, ior);
}
Going through the code:

1. Track whether you're inside the sphere. On the first intersection, the ray enters the sphere, so inside becomes true. On the second intersection, the ray leaves the sphere, so inside becomes false again.
2. Set the ratio of indices of refraction (IOR) based on the ray direction. The IOR for water is 1.33, so going air-to-water the ratio passed to refractRay is 1.0 / 1.33, and going water-to-air it's 1.33.

➤ To fix the compile error currently being shown by Xcode, add this missing function above compute:

Camera refractRay(Camera cam, float3 n, float eps, float ior) {
  cam.ray.origin -= n * eps * 2.0;
  cam.ray.dir = refract(cam.ray.dir, n, ior);
  return cam;
}
The MSL standard library also provides a refract() function, so you’re just building a convenience function around it. You subtract the distance this time because the ray is inside the sphere. You also double the eps value, which is enough to move far enough inside to avoid another collision. If eps were still the old value, the ray might stop and consider it another collision with the object since eps was defined precisely for this purpose: precision. Doubling it will make the ray pass just over the point that was already a collision point before.
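For reference, refract() follows the standard formula shared by GLSL and MSL. Here's a CPU-side Swift sketch of that formula, purely for illustration; eta is the incident IOR divided by the transmitted IOR, and a zero vector signals total internal reflection:

import simd

func refracted(_ i: SIMD3<Float>, _ n: SIMD3<Float>, _ eta: Float) -> SIMD3<Float> {
  let cosI = dot(n, i)
  let k = 1 - eta * eta * (1 - cosI * cosI)
  if k < 0 {
    return SIMD3<Float>(repeating: 0) // total internal reflection: no transmitted ray
  }
  return eta * i - (eta * cosI + k.squareRoot()) * n
}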
➤ Run the playground, and you’ll see the sphere now being refractive.
Raytraced Water It’s relatively straightforward to create a cheap, fake water-like effect on the sphere. ➤ Open the 3. Water playground page, and run it. You’ll see the same animation from the previous section. ➤ Inside the Resources folder for this playground page, open Shaders.metal. ➤ In the distToScene function, locate: float object = (dtp > dts) ? SphereObj : PlaneObj;
➤ And add this code afterward:

if (object == SphereObj) {
  // 1
  float3 pos = r.origin;
  pos += float3(sin(pos.y * 5.0),
                sin(pos.z * 5.0),
                sin(pos.x * 5.0)) * 0.05;
  // 2
  Ray ray = Ray{pos, r.dir};
  dts = distToSphere(ray, s);
}
Going through the code:

1. Get the ray's current position, and apply ripples to the surface of the sphere by altering all three coordinates. Use 0.05 to attenuate the altering. A value of 0.001 is not large enough to make an impact, while 0.01 is too much of an impact.
2. Construct a new ray using the altered position as the new ray origin while preserving the old direction. Calculate the distance to the sphere using this new ray.

➤ In compute, replace:

cam.ray.origin += cam.ray.dir * dist.x;
➤ With: cam.ray.origin += cam.ray.dir * dist.x * 0.5;
You added an attenuation factor of 0.5 to make the animation slower but more precise. ➤ Run the playground, and you’ll see a water-like ball.
Key Points

• The rendering equation is the gold standard of realistic rendering. It describes conservation of energy where the sum of incoming light must equal outgoing light.
• Reflection depends on the angle of the incoming light and the surface normal.
• Refraction takes into account the medium's index of refraction, which defines the speed at which light travels through the medium.
Where to Go From Here? If you want to explore more about water rendering, the references.markdown file for this chapter contains links to interesting articles. This concludes the series of chapters using raymarching. But don’t worry, rendering is far from over. In the next chapters, you’ll dip your toes into image processing and learn about using Metal for accelerating ray tracing.
30
Chapter 30: Metal Performance Shaders
In Chapter 19, “Tessellation & Terrains”, you had a brief taste of using the Metal Performance Shaders (MPS) framework. MPS consists of low-level, fine-tuned, high-performance kernels that run off the shelf with minimal configuration. In this chapter, you’ll dive a bit deeper into the world of MPS.
Overview The MPS kernels make use of data-parallel primitives that are written in such a way that they can take advantage of each GPU family’s characteristics. The developer doesn’t have to care about which GPU the code needs to run on, because the MPS kernels have multiple versions of the same kernel written for every GPU you might use. Think of MPS kernels as convenient black boxes that work efficiently and seamlessly with your command buffer. Simply give it the desired effect, a source and destination resource (buffer or texture), and then encode GPU commands on the fly!
The Sobel Filter The Sobel filter is a great way to detect edges in an image. ➤ In the starter folder for this chapter, open and run sobel.playground, and you’ll see such an effect (left: original image, right: Sobel filter applied):
The Sobel filter

Assuming you’ve already created a device object, a command queue, a command buffer and a texture object for the input image, there are only a few lines of code to apply the Sobel filter to your input image:

let shader = MPSImageSobel(device: device)
shader.encode(
  commandBuffer: commandBuffer,
  sourceTexture: inputImage,
  destinationTexture: drawable.texture)
MPS kernels are not thread-safe, so it’s not recommended to run the same kernel on multiple threads that are all writing to the same command buffer concurrently. Moreover, you should always allocate your kernel to only one device, because the kernel’s init(device:) method could allocate resources that are held by the current device and might not be available to another device.
Note: MPS kernels provide a copy(with:device:) method that allows them to be copied to another device. The MPS framework serves a variety of purposes: • Image processing • Matrix / vector mathematics • Neural Networks Note: The Metal Performance Shaders for ray tracing has been replaced by a newer ray tracing API. In this chapter, you’ll mainly focus on image processing, with a brief look at matrix mathematics.
Image Processing

There are a few dozen MPS image filters, among the most common being:

• Morphological (area min, area max, dilate, erode).
• Convolution (median, box, tent, Gaussian blur, Sobel, Laplacian, and so on).
• Histogram (histogram, histogram equalization, histogram specification).
• Threshold (binary, binary inverse, to zero, to zero inverse, and so on).
• Manipulation (conversion, Lanczos scale, bilinear scale, transpose).

Note: For a complete list of MPS kernels, consult Apple’s official Image Filters page (https://developer.apple.com/documentation/metalperformanceshaders/image_filters). If you want to create your own filters, you can get inspired by Gimp’s list of filters (https://docs.gimp.org/en/filters.html).
An RGB image is nothing but a matrix with numbers between 0 and 255 (when using 8-bit color channels). A greyscale image only has one such matrix because it only has one channel. For color images, there are three separate RGB channels (red, green, blue), so consequently three matrices, one for each channel. One of the most important operations in image processing is convolution, which is an operation consisting of applying a much smaller matrix, often called the kernel, to the original image and obtaining the desired effect as a result. As an example, this matrix is used for obtaining Gaussian blur:
A Gaussian blur matrix

Note: You can find a list of common kernels at https://en.wikipedia.org/wiki/Kernel_(image_processing)

Here’s a diagram showing how the kernel is applied to two pixels:
Convolution

And, here’s how the result shown in green was calculated:

(6 * 1 + 7 * 2 + 3 * 1 +
 4 * 2 + 9 * 4 + 8 * 2 +
 9 * 1 + 2 * 2 + 3 * 1) / 16 = 6
In this example, 16 represents the weight, and it’s not a randomly chosen number — it’s the sum of the numbers from the convolution kernel matrix. When you need to apply convolution to the border pixels, you can apply padding to the input matrix. For example, when the center of a 3×3 convolution kernel overlaps with the image element at position (0, 0), the image matrix needs to be padded with an extra row and extra column of zeros. However, if the bottom rightmost element of the convolution kernel overlaps with the image element at position (0, 0), the image matrix needs to be padded with two extra rows and two extra columns of zeros.
Convolution applied to border pixels

Applying the convolution kernel to an image matrix padded with an extra row, and extra column of zeros, gives you this calculation for a 3×3 kernel:

(0 * 1 + 0 * 2 + 0 * 1 +
 0 * 2 + 6 * 4 + 7 * 2 +
 0 * 1 + 4 * 2 + 9 * 1) / 9 = 6
In this case, the weight, 9, is the sum of the numbers from the convolution kernel matrix that are affecting only the non-zero numbers from the image matrix (4 + 2 + 2 + 1). Something like this is straightforward, but when you need to work with larger kernels and multiple images that need convolution, this task might become non-trivial. You know how to calculate by hand and apply convolution to an image — and at the very beginning of the chapter you saw an MPS filter on an image too — but how about using MPS in your engine? What if you were to implement bloom in your engine? Guess what? You are going to do just that next!
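If you do want to run a custom kernel on the GPU rather than one of the built-in filters, MPS has a general-purpose convolution kernel. Here's a sketch of encoding the 3×3 Gaussian weights used in the calculations above with MPSImageConvolution; as with the earlier Sobel example, it assumes you already have a device, a command buffer, and source and destination textures:

import MetalPerformanceShaders

// The Gaussian weights shown earlier, pre-divided by their sum of 16,
// since MPSImageConvolution does not normalize the kernel for you.
let kernel: [Float] = [
  1, 2, 1,
  2, 4, 2,
  1, 2, 1
]
let weights = kernel.map { $0 / 16 }

let convolution = MPSImageConvolution(
  device: device,
  kernelWidth: 3,
  kernelHeight: 3,
  weights: weights)
convolution.encode(
  commandBuffer: commandBuffer,
  sourceTexture: sourceTexture,
  destinationTexture: destinationTexture)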
Bloom The bloom effect is quite a spectacular one. It amplifies the brightness of objects in the scene and makes them look luminous as if they’re emitting light themselves. Below is a diagram that gives you an overview of how to achieve bloom:
The bloom effect

Here are the steps you’re going to take:

• Render the entire scene to a texture.
• Apply a threshold filter to this texture. This isolates the lighter parts of the image, turning darker areas black.
• Apply a blur filter to the threshold texture from the previous step.
• Combine this texture with the initial scene for the final image.
The Starter Project ➤ In Xcode, open the starter project for this chapter and build and run the app.
The scene is the same as Chapter 25, “Managing Resources”, complete with marching skeletons. The starter project has a new Post Processing group, already set up with Bloom.swift and Outline.swift. In these files, you’ll add some post processing filters to the final image. In the Game group, open Renderer.swift, and locate // Post processing in draw(scene:in:). Renderer initializes bloom and outline, and depending on the option that the user chooses, runs a post processing effect. The project currently renders to the view’s drawable texture. Instead of sending this texture straight to the screen, you’ll intercept and use the drawable texture as input to the threshold filter. Currently, when you choose Bloom or Outline in the app, a print statement goes to the debug console.
Setting Up the Textures ➤ In the Post Processing group, open Bloom.swift, and import the MPS framework: import MetalPerformanceShaders
➤ Define two textures at the top of Bloom:

var outputTexture: MTLTexture!
var finalTexture: MTLTexture!

outputTexture will hold the blurred threshold texture, and finalTexture will hold this texture combined with the initial render.

Renderer calls resize(view:size:) from mtkView(_:drawableSizeWillChange:) whenever the window resizes, so this is where you’ll create the textures.

➤ Add the following code to resize(view:size:):

outputTexture = TextureController.makeTexture(
  size: size,
  pixelFormat: view.colorPixelFormat,
  label: "Output Texture",
  usage: [.shaderRead, .shaderWrite])
finalTexture = TextureController.makeTexture(
  size: size,
  pixelFormat: view.colorPixelFormat,
  label: "Final Texture",
  usage: [.shaderRead, .shaderWrite])
You create the two textures. Later, you’ll use them as the destinations of MPS filters, so you mark them as writeable.
Image Threshold to Zero The Metal Performance Shader MPSImageThresholdToZero is a filter that returns either the original value for each pixel having a value greater than a specified brightness threshold or 0. It uses the following test: destinationColor = sourceColor > thresholdValue ? sourceColor : 0
This filter has the effect of making darker areas black, while the lighter areas retain their original color value.
➤ In postProcess(view:commandBuffer:), replace print("Post processing: Bloom") with:

guard let drawableTexture =
  view.currentDrawable?.texture else { return }
let brightness = MPSImageThresholdToZero(
  device: Renderer.device,
  thresholdValue: 0.5,
  linearGrayColorTransform: nil)
brightness.label = "MPS brightness"
brightness.encode(
  commandBuffer: commandBuffer,
  sourceTexture: drawableTexture,
  destinationTexture: outputTexture)

Here, you create an MPS kernel that produces a threshold texture with a custom brightness threshold of 0.5, where all pixels with a color value less than 0.5 are turned to black. The input texture is the view’s drawable texture, which contains the current rendered scene. The result of the filter will go into outputTexture. Internally, the MPS kernel samples from drawableTexture, so you have to set the view’s drawable to be used for read/write operations.

➤ Open Renderer.swift, and add this to the end of init(metalView:options:):

metalView.framebufferOnly = false
Metal optimizes drawable as much as possible, so setting framebufferOnly to false will affect performance slightly. To be able to see the result of this filter, you’ll blit outputTexture back into drawable.texture. You should be familiar with the blit command encoder from when you copied textures to the heap in Chapter 25, “Managing Resources”
The Blit Command Encoder

➤ Open Bloom.swift, and add this to the end of postProcess(view:commandBuffer:):

finalTexture = outputTexture
guard let blitEncoder =
  commandBuffer.makeBlitCommandEncoder() else { return }
let origin = MTLOrigin(x: 0, y: 0, z: 0)
let size = MTLSize(
  width: drawableTexture.width,
  height: drawableTexture.height,
  depth: 1)
blitEncoder.copy(
  from: finalTexture,
  sourceSlice: 0,
  sourceLevel: 0,
  sourceOrigin: origin,
  sourceSize: size,
  to: drawableTexture,
  destinationSlice: 0,
  destinationLevel: 0,
  destinationOrigin: origin)
blitEncoder.endEncoding()
This copies the output of the previous filter into the drawable texture. Unlike when you copied the textures to the heap, you don’t have to worry about slices and mipmap levels here. ➤ Build and run the app, and select Bloom. You’ll now see the texture filtered to grayscale.
Brightness threshold Notice how only some of the rendered areas were bright enough to make it to this texture. These white areas are all you need to create the bloom effect. Before using this texture, you need to add a little fuzziness to it which will make the model edges appear to glow. You can accomplish this with another MPS kernel: the Gaussian blur.
Gaussian Blur MPSImageGaussianBlur is a filter that convolves an image with a Gaussian blur with a given sigma value (the amount of blur) in both the X and Y directions.
➤ Still in Bloom.swift, in postProcess(view:commandBuffer:), add the following prior to finalTexture = outputTexture: let blur = MPSImageGaussianBlur( device: Renderer.device, sigma: 9.0) blur.label = "MPS blur" blur.encode( commandBuffer: commandBuffer, inPlaceTexture: &outputTexture, fallbackCopyAllocator: nil)
In-place encoding is a special type of encoding where, behind the curtains, the input texture is processed, stored to a temporary texture and finally written back to the input texture without the need for you to designate an output texture. The fallbackCopyAllocator argument allows you to provide a closure where you can specify what will happen to the input image should the in-place normal encoding fail. ➤ Build and run the app, and choose Bloom. You’ll see the result of this blur.
Brightness and blur
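Incidentally, the fallbackCopyAllocator you passed as nil above is a closure MPS calls if in-place encoding isn't possible on the current GPU. A minimal allocator, sketched below with a hypothetical name and living in the same postProcess method, just hands back a writable texture matching the source so MPS can use it as the destination instead:

let fallbackAllocator: MPSCopyAllocator = { kernel, commandBuffer, sourceTexture in
  // Make a scratch texture the same size and format as the source.
  let descriptor = MTLTextureDescriptor.texture2DDescriptor(
    pixelFormat: sourceTexture.pixelFormat,
    width: sourceTexture.width,
    height: sourceTexture.height,
    mipmapped: false)
  descriptor.usage = [.shaderRead, .shaderWrite]
  return commandBuffer.device.makeTexture(descriptor: descriptor)!
}

blur.encode(
  commandBuffer: commandBuffer,
  inPlaceTexture: &outputTexture,
  fallbackCopyAllocator: fallbackAllocator)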
Image Add The final part of creating the bloom effect is to add the pixels of this blurred image to the pixels of the original render. MPSImageArithmetic, as its name suggests, performs arithmetic on image pixels. Subclasses of this include MPSImageAdd, MPSImageSubtract, MPSImageMultiply and MPSImageDivide.
Adding the rendered scene pixels to the lighter blurred pixels will brighten up those parts of the scene. In contrast, adding them to the black pixels will leave them unchanged. ➤ In postProcess(view:commandBuffer:), replace finalTexture = outputTexture with this: let add = MPSImageAdd(device: Renderer.device) add.encode( commandBuffer: commandBuffer, primaryTexture: drawableTexture, secondaryTexture: outputTexture, destinationTexture: finalTexture)
This code adds the drawable texture to outputTexture and places the result in finalTexture. ➤ Build and run the app, and choose Bloom. You’ll see this:
Brightness, blur and add The entire scene is bathed in a mystical glow. Awesome bloom! ➤ In postProcess(view:commandBuffer:), change the initialization of brightness to: let brightness = MPSImageThresholdToZero( device: Renderer.device, thresholdValue: 0.8, linearGrayColorTransform: nil)
Fewer pixels will make it through the brightness filter.
➤ Build and run the app, and choose Bloom.
Glowing skeletons Because the skeletons are the brightest objects in the scene, they appear to glow spookily.
Matrix / Vector Mathematics

You learned in the previous section how you could quickly apply a series of MPS filters that are provided by the framework. But what if you wanted to make your own filters? You can create your own filter functions and calculate convolutions yourself. However, when working with large matrices and vectors, the amount of math involved might get overwhelming. The MPS framework not only provides image processing capability, but it also provides functionality for decomposing and factorizing matrices, solving systems of equations and multiplying matrices and/or vectors on the GPU in a fast, highly parallelized fashion. You’re going to look at matrix multiplication next.

➤ Create a new empty playground for macOS named matrix.playground.

➤ Replace the code with:

import MetalPerformanceShaders

guard let device = MTLCreateSystemDefaultDevice(),
  let commandQueue = device.makeCommandQueue()
  else { fatalError() }
let size = 4
let count = size * size
guard let commandBuffer = commandQueue.makeCommandBuffer() else { fatalError() } commandBuffer.commit() commandBuffer.waitUntilCompleted()
This code creates a Metal device, command queue, command buffer and adds a couple of constants you’ll need later.

➤ Above the line where you create the command buffer, add a new method that lets you create MPS matrices:

func createMPSMatrix(withRepeatingValue: Float) -> MPSMatrix {
  // 1
  let rowBytes = MPSMatrixDescriptor.rowBytes(
    forColumns: size,
    dataType: .float32)
  // 2
  let array = [Float](
    repeating: withRepeatingValue,
    count: count)
  // 3
  guard let buffer = device.makeBuffer(
    bytes: array,
    length: size * rowBytes,
    options: []) else { fatalError() }
  // 4
  let matrixDescriptor = MPSMatrixDescriptor(
    rows: size,
    columns: size,
    rowBytes: rowBytes,
    dataType: .float32)
  return MPSMatrix(buffer: buffer, descriptor: matrixDescriptor)
}
Going through the code: 1. Retrieve the optimal number of bytes between one row and the next. Whereas simd matrices expect column-major order, MPSMatrix uses row-major order. 2. Create a new array and populate it with the value provided as an argument to this method. 3. Create a new buffer with the data from this array. 4. Create a matrix descriptor; then create the MPS matrix using this descriptor and return it.
Use this new method to create and populate three matrices. You’ll multiply A and B together, and you’ll place the result in C. ➤ Add the following code just before creating the command buffer: let A = createMPSMatrix(withRepeatingValue: 3) let B = createMPSMatrix(withRepeatingValue: 2) let C = createMPSMatrix(withRepeatingValue: 1)
➤ Add this code to create a MPS matrix multiplication kernel: let multiplicationKernel = MPSMatrixMultiplication( device: device, transposeLeft: false, transposeRight: false, resultRows: size, resultColumns: size, interiorColumns: size, alpha: 1.0, beta: 0.0)
➤ Below the line where you create the command buffer, add the following code to encode the kernel: multiplicationKernel.encode( commandBuffer: commandBuffer, leftMatrix: A, rightMatrix: B, resultMatrix: C)
You multiply A and B together, and you place the result in C. ➤ At the very end of the playground, add this code to read C: // 1 let contents = C.data.contents() let pointer = contents.bindMemory( to: Float.self, capacity: count) // 2 (0.. Forward Render Pass, select useHeap.
The heap textures

The indirect resources are the textures currently in the heap. One flaw in the app is that all the barrel textures are loaded for each and every barrel. Because you render multiple barrels, they should be instanced, which is one way of fixing the problem, but there is another. When the app loads the textures for the model barrel.usdz in TextureController, the textures aren’t allocated a name in the file, so each texture is given a unique UUID as the file name. However, if you load the model using the obj file format, the mtl file holds the file name, so TextureController only has to load the textures for the first model and can reuse them for the rest.
➤ In the Game group, open GameScene.swift, and in init(), locate where you initialize barrels. ➤ Change the file name from barrel.usdz to: barrel.obj
➤ Build and run the app, and you’ll see that this already improves your frame rate. ➤ Capture the GPU workload and under Command Buffer > Forward Render Pass, select useHeap.
A reduced heap

The size of the heap is now significantly reduced, contributing to a substantial performance gain. Remember to optimize the simple things first, because you may discover that you need no further optimization. You’re in control of your engine. When you design your model loading process, ensure that the model structure fits your app. To get the best performance, you shouldn’t be loading obj or usdz files at all. You should be loading all files from a file format that best suits your app’s API. For further information about how you can do this, watch Apple’s WWDC video From Art to Engine With Model I/O (https://apple.co/3K948Is).
CPU-GPU Synchronization

Managing dynamic data can be a little tricky. Take the case of Uniforms. You update uniforms usually once per frame on the CPU. That means that the GPU should wait until the CPU has finished writing the buffer before it can read the buffer. Instead of halting the GPU’s processing, you can simply have a pool of reusable buffers.
Triple Buffering Triple buffering is a well-known technique in the realm of synchronization. The idea is to use three buffers at a time. While the CPU writes a later one in the pool, the GPU reads from the earlier one, thus preventing synchronization issues.
You might ask, why three and not just two or a dozen? With only two buffers, there’s a high risk that the CPU will try to write the first buffer again before the GPU finished reading it even once. With too many buffers, there’s a high risk of performance issues. ➤ Open Renderer.swift, and replace var uniforms = Uniforms() with: static let buffersInFlight = 3 var uniforms = [Uniforms]( repeating: Uniforms(), count: buffersInFlight) var currentUniformIndex = 0
Here, you replace the uniforms variable with an array of three buffers and define an index to keep track of the current buffer in use. ➤ In updateUniforms(scene:), replace this code: uniforms.projectionMatrix = scene.camera.projectionMatrix uniforms.viewMatrix = scene.camera.viewMatrix uniforms.shadowProjectionMatrix = shadowCamera.projectionMatrix uniforms.shadowViewMatrix = shadowMatrix
➤ With:

uniforms[currentUniformIndex].projectionMatrix =
  scene.camera.projectionMatrix
uniforms[currentUniformIndex].viewMatrix =
  scene.camera.viewMatrix
uniforms[currentUniformIndex].shadowProjectionMatrix =
  shadowCamera.projectionMatrix
uniforms[currentUniformIndex].shadowViewMatrix =
  shadowMatrix
currentUniformIndex =
  (currentUniformIndex + 1) % Self.buffersInFlight
Here, you adapt the update method to include the new uniforms array and make the index loop around always taking the values 0, 1 and 2. ➤ In draw(scene:in:), before updateUniforms(scene: scene), add this: let uniforms = uniforms[currentUniformIndex]
➤ Build and run the app.
Result of triple buffering Your app shows the same scene as before. There is, however, some bad news. The CPU can write to uniforms at any time and the GPU can read from it. There’s no synchronization to ensure the correct uniform buffer is being read. This is known as resource contention and involves conflicts, known as race conditions, over accessing shared resources by both the CPU and GPU. This can cause unexpected results, such as animation glitches. In the image below, the CPU is ready to start writing the first buffer again. However, that would require the GPU to have finished reading it, which is not the case here.
The following example shows two uniform buffers available:
Resource Contention What you need here is a way to delay the CPU writing until the GPU has finished reading it.
A naive approach is to block the CPU until the command buffer has finished executing. ➤ Still in Renderer.swift, add this to the end of draw(scene:in:): commandBuffer.waitUntilCompleted()
➤ Build and run the app. You’re now sure that the CPU thread is successfully being blocked, so the CPU and GPU are not fighting over uniforms. However, the frame rate has gone way down, and the skeleton’s animation is very jerky.
Semaphores

A more performant way is to use a synchronization primitive known as a semaphore, which is a convenient way of keeping count of the available resources — your triple buffer in this case. Here’s how a semaphore works:

• Initialize it to a maximum value that represents the number of resources in your pool (3 buffers here).
• Inside the draw call, the thread waits until a resource is available; if one is, it takes it and decrements the semaphore value by one.
• If there are no more available resources, the current thread is blocked until the semaphore has at least one resource available.
• When a thread finishes using the resource, it signals the semaphore by increasing its value, releasing the hold on the resource.

Time to put this theory into practice.

➤ At the top of Renderer, add this new property:

var semaphore: DispatchSemaphore
➤ In init(metalView:options:), add this before super.init(): semaphore = DispatchSemaphore(value: Self.buffersInFlight)
➤ Add this at the top of draw(scene:in:): _ = semaphore.wait(timeout: .distantFuture)
➤ At the end of draw(scene:in:), but before committing the command buffer, add this: commandBuffer.addCompletedHandler { _ in self.semaphore.signal() }
➤ At the end of draw(scene:in:), remove: commandBuffer.waitUntilCompleted()
➤ Build and run the app again, making sure everything still renders fine as before. Your frame rate should be back to what it was before. The frame now renders more accurately, without fighting over resources.
Key Points

• GPU History, in Activity Monitor, gives an overall picture of the performance of all the GPUs attached to your computer.
• The GPU Report in Xcode shows you the frames per second that your app achieves. This should be 60 FPS for smooth running.
• Capture the GPU workload for insight into what’s happening on the GPU. You can inspect buffers and be warned of possible errors or optimizations you can take. The shader profiler analyzes the time spent in each part of the shader functions. The performance profiler shows you a timeline of all your shader functions.
• GPU counters show statistics and timings for every possible GPU function you can think of.
• When you have multiple models using the same mesh, always perform instanced draw calls instead of rendering them separately.
• Textures can have a huge effect on performance. Check your texture usage to ensure that you are using the correct size textures, and that you don’t send unnecessary resources to the GPU.
32
Chapter 32: Best Practices
When you want to squeeze the very last ounce of performance from your app, you should always remember to follow a golden set of best practices, which are categorized into three major parts: general performance, memory bandwidth and memory footprint. This chapter will guide you through all three.
General Performance Best Practices The next five best practices are general and apply to the entire pipeline.
Choose the Right Resolution The game or app UI should be at native or close to native resolution so that the UI will always look crisp no matter the display size. Also, it is recommended (albeit not mandatory) that all resources have the same resolution. You can check the resolutions in the GPU Debugger on the dependency graph. Below is a partial view of the dependency graph from the multi-pass render in Chapter 14, “Deferred Rendering”:
The dependency graph Notice the size of the shadow pass render target. For crisper shadows, you should have a large texture, but you should consider the performance trade-offs of each image resolution and carefully choose the scenario that best fits your app needs.
Minimize Non-Opaque Overdraw Ideally, you’ll want to only draw each pixel once, which means you’ll want only one fragment shader process per pixel. If you were to draw the skybox before rendering models, your skybox texture would cover the whole render target texture. Drawing the models would then overdraw the skybox fragments with the model fragments. This is why you draw opaque meshes from front to back.
Submit GPU Work Early

You can reduce latency and improve the responsiveness of your renderer by making sure all of the off-screen GPU work is done early and is not waiting for the on-screen part to start. You can do that by using two or more command buffers per frame:

create off-screen command buffer
encode work for the GPU
commit off-screen command buffer
...
get the drawable
create on-screen command buffer
encode work for the GPU
present the drawable
commit on-screen command buffer
Create the off-screen command buffer(s) and commit the work to the GPU as early as possible. Get the drawable as late as possible in the frame, and then have a final command buffer that only contains the on-screen work.
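In Swift, that frame structure might look something like the sketch below. The helper names encodeOffscreenWork and encodeOnscreenWork are placeholders for whatever passes your renderer encodes, and commandQueue and the MTKView are assumed to exist; this isn't the book's renderer API.

import MetalKit

func drawFrame(in view: MTKView, commandQueue: MTLCommandQueue) {
  // Off-screen work first, committed as early as possible.
  guard let offscreenBuffer = commandQueue.makeCommandBuffer() else { return }
  encodeOffscreenWork(into: offscreenBuffer)   // shadow pass, compute, etc.
  offscreenBuffer.commit()                     // the GPU can start right away

  // Grab the drawable as late as possible, then encode only on-screen work.
  guard let drawable = view.currentDrawable,
    let onscreenBuffer = commandQueue.makeCommandBuffer() else { return }
  encodeOnscreenWork(into: onscreenBuffer, target: drawable.texture)
  onscreenBuffer.present(drawable)
  onscreenBuffer.commit()
}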
Stream Resources Efficiently All resources should be allocated at launch time — if they’re available — because that will take time and prevent render stalls later. If you need to allocate resources at runtime because the renderer streams them, you should make sure you do that from a dedicated thread. You can see resource allocations in Instruments, in a Metal System Trace, under the GPU ➤ Allocation track:
You can see here that there are a few allocations, but all at launch time. If there were allocations at runtime, you would notice them later on that track and identify potential stalls because of them.
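As a rough sketch of the dedicated-thread idea, you could push streaming loads onto their own dispatch queue. The queue label and completion handling here are placeholders rather than anything from the book's engine:

import MetalKit

let streamingQueue = DispatchQueue(label: "com.example.resource-streaming")

func streamTexture(
  at url: URL,
  device: MTLDevice,
  completion: @escaping (MTLTexture?) -> Void
) {
  streamingQueue.async {
    // Loading off the render thread avoids stalling the frame.
    let loader = MTKTextureLoader(device: device)
    let texture = try? loader.newTexture(URL: url, options: nil)
    DispatchQueue.main.async { completion(texture) }
  }
}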
Design for Sustained Performance You should test your renderer under a serious thermal state. This can improve the overall thermals of the device, as well as the stability and responsiveness of your renderer. Xcode now lets you see and change the thermal state in the Devices window from Window ▸ Devices and Simulators.
You can also use Xcode’s Energy Impact gauge to verify the thermal state that the device is running at:
Memory Bandwidth Best Practices Since memory transfers for render targets and textures are costly, the next six best practices are targeted to memory bandwidth and how to use shared and tiled memory more efficiently.
Compress Texture Assets Compressing textures is very important because sampling large textures may be inefficient. For that reason, you should generate mipmaps for textures that can be minified. You should also compress large textures to accommodate the memory bandwidth needs. There are various compression formats available. For example, for older devices, you could use PVRTC, and for newer devices, you could use ASTC. Review Chapter 8, “Textures”, for how to create mipmaps and change texture formats in the asset catalog. With the frame captured, you can use the Metal Memory Viewer to verify compression format, mipmap status and size. You can change which columns are displayed by right-clicking the column heading:
Some textures, such as render targets, cannot be compressed ahead of time, so you’ll have to do it at runtime instead. The good news is that the A12 GPU and newer support lossless texture compression, which allows the GPU to compress textures for faster access.
Optimize for Faster GPU Access You should configure your textures correctly to use the appropriate storage mode depending on the use case. Use the private storage mode so only the GPU has access to the texture data, allowing optimization of the contents: textureDescriptor.storageMode = .private textureDescriptor.usage = [ .shaderRead, .renderTarget ] let texture = device.makeTexture(descriptor: textureDescriptor)
You shouldn’t set any unnecessary usage flags, such as unknown, shaderWrite or pixelFormatView, because they may disable compression. Shared textures that can be accessed by the CPU as well as the GPU should explicitly be optimized after any CPU update on their data:

textureDescriptor.storageMode = .shared
textureDescriptor.usage = .shaderRead
let texture = device.makeTexture(descriptor: textureDescriptor)
// update texture data
texture.replace(
  region: region,
  mipmapLevel: 0,
  withBytes: bytes,
  bytesPerRow: bytesPerRow)
let blitCommandEncoder = commandBuffer.makeBlitCommandEncoder()
blitCommandEncoder.optimizeContentsForGPUAccess(texture: texture)
blitCommandEncoder.endEncoding()
Again, the Metal Memory Viewer shows you the storage mode and usage flag for all textures, along with noticing which ones are compressed textures already, as in the previous image.
Choose the Right Pixel Format Choosing the correct pixel format is crucial. Not only will larger pixel formats use more bandwidth, but the sampling rate also depends on the pixel format. You should try to avoid using pixel formats with unnecessary channels and also try to lower precision whenever possible. You’ve generally been using the RGBA8Unorm pixel format in this book. However, when you needed greater accuracy for the G-Buffer in Chapter 14, “Deferred Rendering”, you used a 16-bit pixel format. Again, you can use the Metal Memory Viewer to see the pixel formats for textures.
Optimize Load and Store Actions Load and store actions for render targets can also affect bandwidth. If you have a suboptimal configuration of your pipelines caused by unnecessary load/store actions, you might create false dependencies. An example of optimized configuration would be as follows: renderPassDescriptor.colorAttachments[0].loadAction = .clear renderPassDescriptor.colorAttachments[0].storeAction = .dontCare
In this case, you’re configuring a color attachment to be transient, which means you do not want to load or store anything from it. You can verify the current actions set on render targets in the Dependency Viewer.
Redundant store action As you can see, there is an exclamation point that suggests that you should not store the last render target.
Optimize Multi-Sampled Textures

iOS devices have very fast multi-sampled render targets (MSAA) because they resolve from Tile Memory, so it is best practice to consider MSAA over native resolution. Also, make sure not to load or store the MSAA texture, and set its storage mode to memoryless:

textureDescriptor.textureType = .type2DMultisample
textureDescriptor.sampleCount = 4 textureDescriptor.storageMode = .memoryless let msaaTexture = device.makeTexture(descriptor: textureDescriptor) renderPassDesc.colorAttachments[0].texture = msaaTexture renderPassDesc.colorAttachments[0].loadAction = .clear renderPassDesc.colorAttachments[0].storeAction = .multisampleResolve
The dependency graph will, again, help you see the current status set for load/store actions.
Leverage Tile Memory Metal provides access to Tile Memory for several features such as programmable blending, image blocks and tile shaders. Deferred shading requires storing the GBuffer in a first pass and then sampling from its textures in the second lighting pass where the final color accumulates into a render target. This is very bandwidth-heavy. iOS allows fragment shaders to access pixel data directly from Tile Memory in order to leverage programmable blending. This means that you can store the G-Buffer data on Tile Memory, and all the light accumulation shaders can access it within the same render pass. The four G-Buffer attachments are fully transient, and only the final color and depth are stored, so it’s very efficient.
Memory Footprint Best Practices Use Memoryless Render Targets As mentioned previously, you should be using memoryless storage mode for all transient render targets that do not need a memory allocation, that is, are not loaded from or stored to memory: textureDescriptor.storageMode = .memoryless textureDescriptor.usage = [.shaderRead, .renderTarget] // for each G-Buffer texture textureDescriptor.pixelFormat = gBufferPixelFormats[i] gBufferTextures[i] = device.makeTexture(descriptor: textureDescriptor) renderPassDescriptor.colorAttachments[i].texture = gBufferTextures[i] renderPassDescriptor.colorAttachments[i].loadAction = .clear renderPassDescriptor.colorAttachments[i].storeAction = .dontCare
You’ll be able to see the change immediately in the dependency graph.
Avoid Loading Unused Assets Loading all the assets into memory will increase the memory footprint, so you should consider the memory and performance trade-off and only load all the assets that you know will be used. The GPU frame capture Memory Viewer will show you any unused resources.
Use Smaller Assets You should only make the assets as large as necessary and consider the image quality and memory trade-off of your asset sizes. Make sure that both textures and meshes are compressed. You may want to only load the smaller mipmap levels of your textures or use lower level of detail meshes for distant objects.
Simplify Memory-Intensive Effects

Some effects may require large off-screen buffers, such as Shadow Maps and Screen Space Ambient Occlusion, so you should consider the image quality and memory trade-off of all of those effects, potentially lower the resolution of these large off-screen buffers, and even disable the memory-intensive effects altogether when you are memory constrained.
Use Metal Resource Heaps Rendering a frame may require a lot of intermediate memory, especially if your game becomes more complex in the post-process pipeline, so it is very important to use Metal Resource Heaps for those effects and alias as much of that memory as possible. For example, you may want to reutilize the memory for resources that have no dependencies, such as those for Depth of Field or Screen-Space Ambient Occlusion. Another advanced concept is that of purgeable memory. Purgeable memory has three states: non-volatile (when data should not be discarded), volatile (data can be discarded even when the resource may be needed) and empty (data has been discarded). Volatile and empty allocations do not count towards the application’s memory footprint because the system can either reclaim that memory at some point or has already reclaimed it in the past.
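To make the aliasing idea concrete, here's a sketch of letting two post-processing targets that never overlap within a frame share the same heap memory. The heap and the two descriptors are assumed to exist already; this isn't code from the book's engine:

import Metal

func encodePostEffects(heap: MTLHeap) {
  guard let dofTexture = heap.makeTexture(descriptor: dofDescriptor) else { return }
  // ... encode the depth-of-field pass that writes and reads dofTexture ...

  // Once nothing later in the frame needs it, let the heap reuse its storage.
  dofTexture.makeAliasable()

  // A later allocation, such as an SSAO target, can now alias that memory.
  guard let ssaoTexture = heap.makeTexture(descriptor: ssaoDescriptor) else { return }
  // ... encode the ambient occlusion pass using ssaoTexture ...
  _ = ssaoTexture
}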
Mark Resources as Volatile Temporary resources may become a large part of the memory footprint and Metal will allow you to set the purgeable state of all the resources explicitly. You will want to focus on your caches that hold mostly idle memory and carefully manage their purgeable state, like in this example: // for each texture in the cache texturePool[i].setPurgeableState(.volatile) // later on... if (texturePool[i].setPurgeableState(.nonVolatile) == .empty) { // regenerate texture }
Manage the Metal PSOs

Pipeline State Objects (PSOs) encapsulate most of the Metal render state. You create them using a descriptor that contains vertex and fragment functions as well as other state descriptors. All of these get compiled into the final Metal PSO. Metal allows your application to load most of the rendering state upfront, improving performance over OpenGL. However, if you have limited memory, make sure not to hold on to PSO references that you don’t need anymore. Also, don’t hold on to Metal function references after you have created the PSO cache, because they are not needed to render; they are only needed to create new PSOs.

Note: Apple has written a Metal Best Practices guide (https://developer.apple.com/library/archive/documentation/3DDrawing/Conceptual/MTLBestPracticesGuide/index.html) that provides great advice for optimizing your app.
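As a sketch of that advice, you might build each PSO in a helper like the one below and let the descriptor and MTLFunction references go out of scope once the state is compiled. The shader function names here are placeholders:

import Metal

func makePipelineState(
  device: MTLDevice,
  library: MTLLibrary,
  colorPixelFormat: MTLPixelFormat
) throws -> MTLRenderPipelineState {
  let descriptor = MTLRenderPipelineDescriptor()
  descriptor.vertexFunction = library.makeFunction(name: "vertex_main")
  descriptor.fragmentFunction = library.makeFunction(name: "fragment_main")
  descriptor.colorAttachments[0].pixelFormat = colorPixelFormat
  // Only the compiled pipeline state is needed to render; the descriptor
  // and function objects can be released as soon as this returns.
  return try device.makeRenderPipelineState(descriptor: descriptor)
}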
Where to Go From Here?

Getting the last ounce of performance out of your app is paramount. You’ve had a taste of examining CPU and GPU performance using Xcode, but to go further, you’ll need to use Instruments with Apple’s Instruments documentation (https://help.apple.com/instruments/mac/10.0/).

Over the years, at every WWDC since Metal was introduced, Apple has produced some excellent WWDC videos describing Metal best practices and optimization techniques. Go to https://developer.apple.com/videos/graphics-and-games/metal/ and watch as many as you can, as often as you can.

Congratulations on completing the book! The world of Computer Graphics is vast and as complex as you want to make it. But now that you have learned the basics of Metal, even though current internet resources are few, you should be able to learn techniques described with other APIs such as OpenGL, Vulkan and DirectX. If you’re keen to learn more, check out some of these great books:

Real-Time Rendering, Fourth Edition, by Tomas Akenine-Möller, Eric Haines, Naty Hoffman, Angelo Pesce, Michał Iwanicki, and Sébastien Hillaire - http://www.realtimerendering.com

Physically Based Rendering, Third Edition, by Matt Pharr, Wenzel Jakob and Greg Humphreys - https://www.pbrt.org

Computer Graphics: Principles and Practice, Third Edition, by John F. Hughes, Andries van Dam, Morgan McGuire, David F. Sklar, James D. Foley, Steven K. Feiner, Kurt Akeley - https://www.amazon.com/Computer-Graphics-Principles-Practice-3rd/dp/0321399528

Fundamentals of Computer Graphics, Fourth Edition, by Steve Marschner and Peter Shirley - https://www.amazon.com/Fundamentals-Computer-Graphics-SteveMarschner/dp/1482229390

Game Coding Complete, Fourth Edition, by Mike McShaffry and David Graham - https://www.mcshaffry.com/GameCode/
33 Conclusion
Thank you again for purchasing Metal by Tutorials. If you have any questions or comments as you continue to develop for Metal, please stop by our forums at https://forums.raywenderlich.com. Your continued support is what makes the tutorials, books, videos and other things we do at raywenderlich.com possible. We truly appreciate it.

Best of luck in all your development and game-making adventures,

– Caroline Begbie (Author), Marius Horga (Author), Adrian Strahan (Tech Editor) and Tammy Coron (FPE).

The Metal by Tutorials team