Media Creation Curriculum in the AI Era #6 — Shooting — Capturing the Materiality That AI Cannot Replace

Updated: 2026-05

1. Introduction

This essay is a Q&A between the author and Claude AI about a redesign of the author’s media production course. As the sixth installment in the “Media Creation Curriculum in the AI Era” series, it takes camera-based shooting — the act of pointing a physical instrument at the physical world to record an image — as its subject, and surveys the available expressions and methods.

Installment #1 used Andrew Price’s argument as a starting point to set the overall direction for “Visual Practicum I/II.” Installment #2 examined a node-based syllabus built entirely on ComfyCloud. Installment #3 considered a short-film curriculum centered on generative-video AI. Installments #1–#3 all belong to a single family: the time-based generative AI lineage that starts from prompts and models and produces “video that lives inside the screen.” Installments #4 and #5 then turned to media programming and physical-space staging built on TouchDesigner. That second family is real-time generative.

The shooting practices covered here sit outside both of those families. The subject is territory that neither generative-video AI nor real-time generative systems like TouchDesigner can replace: the act of pointing a camera — a physical instrument — at the physical world to record time and light. The act is time-based, but its principle is utterly different from AI generation. It does not “generate” video; it “records” it, by being present at a real site, choosing equipment, configuring it, and pressing the shutter.

The aim of this essay is to organize that territory into three lineages (compositional shooting, special-purpose shooting, and time manipulation) and to give a single overview of the expressions and methods that gain value precisely because we are in the AI era. The course-side structure (mapping onto 180-min × 15 sessions, evaluation design, operational notes) will be handled in a subsequent essay. This piece is positioned as the prerequisite: “a reference for what shooting can do.”

1.1 Reference sites

References

1.2 What this essay covers

Positioning shooting: shooting as the other pole of time-based production, contrasted with AI generation and real-time generative systems — materiality, uniqueness, equipment-dependence, embodiment
Compositional shooting: camerawork as a thinking practice. SNS-native shooting by futa.729s, cinematic shooting by Blake Ridder, theory of cinematographic camerawork
Special-purpose shooting: macro / microscope, motion control (Edelkrone), Tilt-Shift / Small Planet, long exposure / light trails
Time manipulation: theory of high-frame-rate shooting, equipment selection, smartphone slow motion by aaa_tsushi, time manipulation in Premiere / After Effects (Time Remapping / Optical Flow / Twixtor)
The significance of shooting in the AI era

2. Positioning shooting — what shooting means in the AI era

Shooting is the act of pointing a camera — a physical instrument — at the physical world to record time and light. The most classical family of techniques in moving-image production lives here, with roots that long predate AI generation. It stands on a principle independent from both the AI generation handled in installments #1–#3 and the media programming handled in #4–#5.

This chapter lays out why shooting remains an important family of techniques even now, at a moment when AI generation is developing rapidly, and introduces the three lineages this essay covers (compositional shooting, special-purpose shooting, time manipulation).

2.1 Four properties of shooting

Placed alongside AI generation, four properties of shooting come into focus. They are also the grounds for shooting’s distinctiveness from AI generation.

Materiality: The subject exists in the physical world. The light hitting the lens, the image recorded on the sensor, the physical characteristics of the lens and sensor behind it, the haze in the air, the spectral distribution of the light — every one of these follows the behavior of matter.
Uniqueness: Only the combination of light that existed at the instant the shutter fired is recorded. The image is the single point at which subject, light, and the shooter’s body happened to intersect.
Equipment-dependence: The focal length, aperture, shutter speed, sensor dynamic range, frame-rate ceiling, and other physical characteristics of the equipment are written directly into the picture. The same scene becomes a different picture depending on the equipment.
Embodiment: The shooter’s body is on site. Position, angle, motion, breathing, the timing of each judgment — all of these are inscribed in the picture. Even on a fixed tripod, the bodily judgment that decided the composition remains.

These four properties cannot, in principle, be reproduced by AI generation. AI generation has no lens, no sensor, no light at the site, no shooter’s body. It has only the model’s internal representation and the prompts given to it. What it produces may “look the part,” but holds none of the four properties. Shooting’s distinctiveness lies here.

2.2 How shooting’s significance shifts in the AI era

What matters here is that, with AI generation now widely available, shooting’s significance grows rather than diminishes. There are three reasons.

AI generation is hitting visible ceilings: Generative systems have exploded since 2023, but they remain weak at physical consistency (objects not passing through each other, light staying coherent, shadows extending correctly, and so on). Recent surveys on arXiv and elsewhere show that visual photorealism and physical understanding are different things. Shot material has that consistency from the start.
Shot material becomes input for AI generation: In pipelines like i2v (image-to-video), v2v (video-to-video), and ControlNet, shot material is the starting point for generation. The design-sheet approach by Yamamoto covered in #3 and the Higgsfield Canvas node flow both combine live action with generation. Shooting becomes the fuel for generation.
Scarcity as primary material: Unlike AI footage, which can be generated indefinitely from prompts, shot material only exists if the shooter was at the site. Scarcity of the source material pushes its relative value up when placed beside AI generation.

To revisit the argument in installment #1 from Andrew Price: what remains in the AI era is judgment, and the bearer of that judgment is the high-agency individual. Shooting is a dense cluster of judgments — subject selection, position, equipment choice, composition, timing, each one a judgment after the last. Shooting is the practice of judgment in the AI era, and it is also the place where that judgment becomes visible.

2.3 Three lineages of shooting covered here

This essay treats shooting in three lineages.

Compositional shooting (Chapter 3): Focuses on the side of camerawork and composition that is assembled in the head — framing, the path of the eye, direction of motion, shot design that anticipates editing. Ordinary camerawork (pan, tilt, dolly, etc.) appears here.
Special-purpose shooting (Chapter 4): Techniques for capturing territory invisible to ordinary human vision. Macro and microscopic shooting, motion control, Tilt-Shift / Small Planet, long exposure / light trails. The equipment-dependence and material side comes to the front in this lineage.
Time manipulation (Chapter 5): Techniques that exceed ordinary perception by operating on the time axis. Slow motion, hyperlapse, time interpolation in Premiere / After Effects.

The thread running through all three is “how to produce textures that AI generation cannot reproduce.” Compositional shooting relies on the shooter’s judgment; special-purpose shooting on the materiality of the equipment; time manipulation on the time axis itself. Each addresses a domain that AI generation is bad at.

3. Compositional shooting — camerawork as a thinking practice

Shooting is the act of operating equipment, but it is also — before the equipment is raised — the act of judging what to shoot, from where, and how. This essay calls that latter side “compositional shooting.” The same camera and the same subject, paired with different compositional judgments, produce different pictures. This chapter organizes the logic and practice of compositional shooting starting from two contrasting cases — SNS-native composition (futa.729s) and narrative cinema composition (Blake Ridder) — and extends the discussion out to cinematographic theory.

3.1 What compositional shooting is

Compositional shooting handles, per-shot, judgments like the following.

Framing: what enters the frame and what stays out. How the foreground / mid-ground / background layers are constructed.
Composition: rule of thirds, leading lines, symmetry, negative space — how each is deployed.
Path of the gaze: where the audience’s eye first lands, how it moves, where it stops.
Camera position: height, distance, angle. The physical relationship to the subject.
Camera motion: stationary, pan, dolly, handheld. Speed and direction of motion.
Lens choice: angle of view and compression by focal length, bokeh quality.
Timing: the moment the shutter fires; the moment motion begins or ends.
Linkage with editing: shot design that anticipates how the shot will connect to the next.

Most of these judgments are settled before the equipment is raised. That is, the “thinking” step is the core of shooting. The high-agency posture argued for in installment #1 is the capacity to bear that chain of judgments.
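As one concrete instance of the lens-choice judgment in the list above, the relationship between focal length and angle of view can be put into numbers. The sketch below uses the standard rectilinear-lens formula, angle = 2 · atan(sensor width / (2 · focal length)). The 36 mm full-frame sensor width and the function name are assumptions for illustration; a phone or APS-C sensor would need its own width.

```python
import math


def horizontal_angle_of_view_deg(focal_length_mm: float,
                                 sensor_width_mm: float = 36.0) -> float:
    """Horizontal angle of view (degrees) of a rectilinear lens focused at infinity.

    sensor_width_mm defaults to 36.0, an assumed full-frame sensor width.
    """
    if focal_length_mm <= 0 or sensor_width_mm <= 0:
        raise ValueError("focal length and sensor width must be positive")
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


if __name__ == "__main__":
    # Compare how much of the scene each focal length takes in.
    for focal in (16, 24, 35, 50, 85, 135):
        angle = horizontal_angle_of_view_deg(focal)
        print(f"{focal:>3} mm: {angle:5.1f} degrees")
```

A table like this is only a planning aid. It says how wide the frame is, not whether the resulting compression or background separation serves the intent of the shot.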

3.2 SNS-native composition — the futa.729s case

Shirai Futa (@futa.729s) is a video creator and SNS producer based in Osaka, with 382K followers on Instagram and 917K on TikTok. The practice centers on brand-story-style vertical short videos of around 30 seconds. The stated concept: “More beautiful, memorable visual expression.”

3.2.1 Shirai Futa’s signature techniques

Shirai Futa’s compositional shooting has a distinct set of signature techniques. The representative ones:

Hand Power (the image responds when a hand reaches in): Syncs hand motion with camera motion so that the subject or scene appears to react to the hand. One hand operates phone or gimbal, while the other enters the frame for the effect.
Object Transition: A passport picked up triggers a scene change to Seoul; a book opens onto another world; a paper bag holds the spread of a season. The cut between time and space is staged through an object.
POV (first-person view): A circus performer’s perspective, the moment of bursting out of a school gate after class. The shot is immersed in the subject’s point of view. The result is an experience that registers directly with the brain.
Focal Length Comparison: The same scene shot at several focal lengths to show how the world the lens carves out shifts with the lens. A pedagogical kind of demonstration.
Light/Dark Switch: The instant the phone’s screen brightness is dropped, a fantastical night landscape rises — and similar maneuvers, where on-screen luminance changes flip the texture of the scene.

What these techniques share is a design where “the camera and the world outside the camera (the shooter’s body, props, the scene) move together.” Because the work cannot be completed by the camera alone, how well the composition is thought through before shooting — what to hold, how to move it, where to switch — decides whether it works.

3.2.2 Equipment workflow

The kit is consumer- to prosumer-grade, not pro-cinema gear.

Camera: iPhone (including iPhone 17), Sony bodies, DJI Osmo Pocket 3, and similar
Stabilizer: Hohem iSteady V3
Tripod: when the situation calls for it
Editing: CapCut, Premiere Pro, After Effects, etc. (not explicitly stated, but standard for adjacent creators)

The equipment choice prioritizes mobility for shooting and optimization for the SNS vertical format. The freedom of movement you cannot get from a heavy cinema camera plus gimbal is obtained from lightweight gear instead.

3.2.3 Principles you can learn from the work

The principles you can extract from Shirai Futa’s work and use as teaching material:

Storyboards and prop design before the shoot: Object transitions are 80% decided by the storyboard and the props prepared. The shoot itself becomes a matter of executing the plan.
Shoot with the edit in mind: To make a transition land, you need to align the direction, speed, and color of motion across the surrounding shots. The habit of imagining the post-edit during the shoot is essential.
Body / camera unification: Using a gimbal unifies camera operation and bodily motion. There is a strong sports-training side to it.
Embrace the SNS vertical-format constraints: A 9:16 frame, hooking within the first second, telling the story within 30 seconds — these constraints become creative starting points rather than restrictions.

3.3 Narrative cinema composition — the Blake Ridder case

Blake Ridder is a UK-based filmmaker, writer, and actor. He has made more than 70 shorts and three features, with awards across multiple international festivals. He puts out tutorials on shooting, editing, color grading, and screenwriting via YouTube / TikTok / Instagram, and runs the Filmmaking Masterclass (masterclass.ridderfilms.com).

3.3.1 Blake Ridder’s approach

Blake Ridder consistently argues that judgment, not equipment, makes the picture. Paraphrased from the masterclass copy: good shooting is a matter of “control, intent, and emotion,” of “making the audience feel something with composition, shadow, and color.”

Major topics that work as teaching material:

Three-layer framing: Foreground, leading lines, and rule of thirds — combined to produce depth and gaze guidance in the frame.
Turning day into night with color: Shooting outdoors in daytime and pushing it to night via contrast and color adjustment. Practical examples are presented with Sony FX2 + DaVinci Resolve.
Editing rhythm: Combining J-cut, L-cut, match shot, and cutting on motion, so that editing carries “emotion” rather than mere “precision.”
A cinematic look on a low budget: Expensive gear is not required; what is required is “using light, color, and motion with intent.” You can start on a phone.
AI-integrated workflow: At demo workshops including Cannes, Ridder shows pipelines that combine live action with AI generation — for example, replacing backgrounds with Kling AI rather than shooting against green screen.

3.3.2 Equipment list

The gear list Ridder publishes (on Amazon and elsewhere) covers a wide range, from mirrorless cameras to phones: SmallRig phone shooting kits, Sony FX2 / FX3, DaVinci Resolve Studio, Insta360 Flow 2 gimbals, and so on. The common thread is that equipment is positioned as a support for judgment, not a replacement for it.

3.3.3 Principles you can learn from the work

The principles you can extract from Ridder’s approach cover the side that the futa.729s case does not.

Intent is everything: Equipment and technique are subordinate to what you want the audience to feel. Technique without intent ends as decoration.
Designing contrast: Light/dark, color, motion, size — how contrast is constructed inside the frame is what carries the emotion.
Direction of performance and space: A cinematographer’s role is to unify light, color, and motion, and that role does not stand without trust from cast and crew.
The blurring of cinema / SNS boundaries: What used to be separate worlds — cinematic shooting and SNS-style shooting — are mixing as equipment and editing tools become more democratized. The Cannes demo workshops are a symbol of that mixing.

3.4 The cinematographic camerawork textbooks

The futa.729s and Blake Ridder cases are contrasting, but both are individual-author methodologies. Set against them, the camerawork theory accumulated inside the film industry is more systematic. In a course, it is important to teach these systems as “the vocabulary of camerawork.” Here is a summary of the major classifications.

3.4.1 Major patterns of camera movement

Across major teaching materials — StudioBinder, MasterClass, Backstage, and others — the camera movements commonly listed are:

Static Shot: Camera fixed on a tripod or similar; only the subject moves. The default for dialogue scenes. Suits Martin Scorsese’s approach of giving actors’ improvisation room to breathe.
Pan: Swing the camera horizontally. Used to show the breadth of a scene, follow a subject, or reveal new information. The “Whip Pan (Swish Pan)” — a fast pan — is used by Paul Thomas Anderson, Damien Chazelle, and Quentin Tarantino to inject energy.
Tilt: Swing the camera vertically. Effective for conveying the height of a structure or a feeling of awe. Steven Spielberg’s use to introduce the dinosaurs in Jurassic Park is well known.
Push-in / Pull-out: Move the camera toward / away from the subject with a dolly or steadicam. The push-in expresses tension, intimacy, intrusion into the interior; the pull-out expresses isolation or the reveal of the whole picture. Coppola’s push-in in The Godfather and Kubrick’s pull-out in The Shining are the canonical examples.
Zoom: Change focal length. Because this is a motion that does not exist for the human eye, it tends to leave an unnatural impression. Kubrick’s use of it as a character falls into madness in Full Metal Jacket is representative.
Dolly Zoom (Vertigo Shot): Push in while zooming out (or vice versa). The subject’s size is preserved while the background perspective distorts. Originating in Hitchcock’s Vertigo and used in Lord of the Rings and many others to express fear or the loss of reality.
Roll (Dutch Angle): Rotate the camera around its optical axis. Expresses unease or instability. The slow roll on Killmonger’s coronation in Marvel’s Black Panther is cited as an example.
Tracking / Truck: Move the camera horizontally or longitudinally to follow the subject. Pairs well with long takes; Roger Deakins’s continuous Steadicam shots in 1917 are representative.
Arc: Camera motion that circles the subject. Provides tension and dynamism. Christopher Nolan uses it on the Joker in The Dark Knight.
Boom / Crane: Vertically lift / lower the camera on a crane or jib. Effective for scene-establishing shots.
Handheld: Held by the shooter. Yields rawness, immediacy, subjectivity. The documentary-style handheld in The Big Short is cited.
Bird’s Eye: A view from extremely high. Expresses the subject’s vulnerability or powerlessness, or the perspective of a god.
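The dolly zoom entry in the list above says the subject's size is preserved while the background distorts. A minimal sketch of why: for subjects much farther away than the focal length, image size is roughly proportional to focal length divided by subject distance, so keeping that ratio constant keeps the subject the same size. The function names and the 70 mm / 4 m starting values are illustrative assumptions, not figures from any of the cited films.

```python
def dolly_zoom_focal_length(start_focal_mm: float,
                            start_distance_m: float,
                            new_distance_m: float) -> float:
    """Focal length that keeps the subject the same size after the camera moves.

    Assumes image size is proportional to focal length / subject distance,
    which holds when the subject is far away compared with the focal length.
    """
    if min(start_focal_mm, start_distance_m, new_distance_m) <= 0:
        raise ValueError("all inputs must be positive")
    return start_focal_mm * new_distance_m / start_distance_m


def dolly_zoom_schedule(start_focal_mm: float,
                        start_distance_m: float,
                        end_distance_m: float,
                        steps: int) -> list[tuple[float, float]]:
    """(distance, focal length) pairs at evenly spaced points along the move."""
    if steps < 2:
        raise ValueError("need at least two steps")
    schedule = []
    for i in range(steps):
        distance = start_distance_m + (end_distance_m - start_distance_m) * i / (steps - 1)
        focal = dolly_zoom_focal_length(start_focal_mm, start_distance_m, distance)
        schedule.append((distance, focal))
    return schedule


if __name__ == "__main__":
    # Push in from 4 m to 2 m while zooming out from 70 mm.
    for distance, focal in dolly_zoom_schedule(70.0, 4.0, 2.0, 5):
        print(f"{distance:.2f} m -> {focal:.1f} mm")
```

In this example the lens has to travel from 70 mm to 35 mm as the camera halves its distance, which is why the move normally needs a zoom lens and a coordinated operator or motorized control.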

3.4.2 How camera movement maps to emotion

Each of these movements is linked to a specific emotion or effect. What matters is not “knowing the moves as techniques” but “understanding the correspondence between move and emotion.” The audience reads emotion from camera movement unconsciously. Move in a way that doesn’t match, and the audience gets confused.

Amplify tension / intimacy: push-in, slow zoom in
Express isolation / distance: pull-out, slow zoom out
Unease / instability: roll / Dutch Angle, handheld
Scale / awe: tilt-up, boom-up
Establishing the scene: pan, boom, establishing shot
Energy / speed: whip pan, tracking, handheld
God’s view / powerlessness: bird’s eye
Distortion of reality / madness: dolly zoom, extreme zoom

3.4.3 Shot sizes and angles

In addition to camera movement, shot size (how to crop the subject) and camera angle (where to put the camera relative to the subject) are part of the basic vocabulary of compositional shooting.

Shot sizes (widest first):

Extreme Wide Shot (EWS): an extremely wide shot showing the relationship of place and figure
Wide Shot (WS) / Long Shot (LS): the whole figure of a person fits
Medium Wide Shot (MWS): knees to head
Medium Shot (MS): waist to head
Medium Close-Up (MCU): chest to head
Close-Up (CU): shoulder to head, or face only
Extreme Close-Up (ECU): an extreme push — eyes only, mouth only, hands only

Angles:

Eye Level: same height as the subject’s eye line. Neutral.
High Angle: looks down on the subject. Emphasizes vulnerability.
Low Angle: looks up at the subject. Emphasizes strength / dominance.
Bird’s Eye: directly overhead
Worm’s Eye: directly below
Over-the-Shoulder (OTS): from over the shoulder. The default for dialogue scenes.
POV (Point of View): subjective view of a character.

This vocabulary, in total, is a toolkit for designing the relationship between the audience’s emotion and the subject. The standard order in a course is: memorize the vocabulary first, then watch examples to understand the correspondences, then use it in your own work.

3.5 Judgments that run through compositional shooting

Across Shirai Futa’s SNS-native composition, Blake Ridder’s narrative-cinematic composition, and cinematographic camerawork theory, the shared axes of judgment can be organized as follows.

The order: intent → emotion → technique: Decide first what you want the audience to feel, then choose the technique that conveys that emotion. Starting from the technique produces a decorative picture.
Design that includes the edit: A piece is built from a sequence of shots, not from one shot. Each shot has meaning only in relation to the ones before and after it.
Equipment supports judgment: Equipment choice is the result of judgment, not its starting point. Whether to shoot on a phone, a mirrorless, or a cinema camera is decided by the emotion you want to express and the constraints of the site.
Designing contrast: Light / dark, color, motion, size, distance — all of them carry the picture forward by being placed in contrast.

Of these judgments, the part that says “what to make the audience feel” can be folded into a prompt for AI generation, but “what is happening at the site” and “the choice and operation of equipment” drop out. Shooting is the act that carries what AI generation cannot.

4. Special-purpose shooting — territory beyond ordinary vision

“Special-purpose shooting” is the umbrella term for shooting techniques that capture a world the naked eye cannot see. Macro / microscope captures the microscopic; motion control gives camera motion machine-grade precision; Tilt-Shift / Small Planet distorts perception; long exposure layers time. Each fixes a different territory onto the picture, and each is bound tightly to the physical characteristics of the equipment — which makes the textures hard to imitate with AI generation. This chapter walks through them in order.

4.1 Macro and microscopic shooting

Shooting that captures a world smaller than the eye can see has demand across nature documentaries, product photography, scientific imaging, and other fields. AI generation has trouble holding up depth-of-field, refraction of light, and reproducibility of detail at micro scales.

4.1.1 The stages of macro lenses

Macro shooting requires very different gear depending on the maximum magnification.

1:1 macro (life size): The image on the sensor is the same size as the real subject. Canon EF 100mm F2.8L, Sony FE 90mm F2.8 Macro G, Fujinon XF 80mm Macro, and most other major makers cover this standard range.
1:1 to 2:1: The image on the sensor is 1× to 2× the real subject. Laowa 60mm F2.8 2X Ultra Macro, and similar. The range where insect faces and flower detail begin to be visible.
2:1 to 5:1: Requires specialty lenses like Laowa 25mm F2.8 2.5-5X Ultra Macro or Canon MP-E 65mm F2.8 1-5X Macro Photo. Butterfly scales and plant cell structure come into view.
5:1 and above (microscope territory): Mount a microscope objective (4×, 10×, 20×, 40×, 100×) to the camera. Infinity Photo-Optical InfiniProbe TS-160, Mitutoyo Plan Apo, and so on. The level of microorganisms and crystal structure.
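The magnification stages above translate directly into how much of the subject fills the frame: the width of subject covered equals the sensor width divided by the magnification. The sketch below tabulates that for the ratios named in the list. The 36 mm full-frame sensor width is an assumption; smaller sensors cover proportionally less.

```python
def subject_field_width_mm(magnification: float,
                           sensor_width_mm: float = 36.0) -> float:
    """Width of the subject area that fills the frame at a given magnification.

    magnification is image size / subject size (1.0 for 1:1, 5.0 for 5:1).
    sensor_width_mm defaults to 36.0, an assumed full-frame sensor width.
    """
    if magnification <= 0 or sensor_width_mm <= 0:
        raise ValueError("magnification and sensor width must be positive")
    return sensor_width_mm / magnification


if __name__ == "__main__":
    for label, m in (("1:1", 1.0), ("2:1", 2.0), ("5:1", 5.0), ("10:1", 10.0)):
        print(f"{label:>4}: frame spans {subject_field_width_mm(m):.1f} mm of subject")
```

At 5:1 on this assumed sensor the whole frame spans only about 7 mm of the subject, which gives a sense of why the higher stages need rails and stages rather than handheld work.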

4.1.2 Probe lenses — an innovation in perspective

In recent years, the “probe lens” has dramatically expanded the range of macro work. The representative example is the Laowa 24mm F14 2X Macro Probe — a thin, rod-shaped barrel over 40 cm long with the lens at the tip.

References

Laowa 24mm F14 2X Macro Probe

Features of the probe lens:

20 mm-diameter ultra-thin tip: fits into narrow spaces that ordinary macro lenses (60 mm+ in diameter) cannot enter — inside a bottle, inside a cup, just above the ground.
Waterproof front barrel: the tip can go inside liquids (aquariums, glasses).
Built-in LED ring light: the light source is at the tip, so lighting holds even in dark places.
Wide-angle macro perspective: ordinary macro lenses are medium telephoto to telephoto, but a 24 mm wide-angle macro produces “subject close, background still in view” — a distinct look.

In the BBC documentary Secrets of the Bees, around 60–70% of the shots use the Laowa Macro Probe series. The judgment behind it: maintaining the relationship between subject and environment while still in macro, in order to produce an “aha moment” for the audience.

4.1.3 Microscope territory — Nelsonian optics

At magnifications beyond 5×, extending an ordinary macro lens is difficult, and the optics of a microscope need to be borrowed.

Microscope objective + custom adapter: Mount an objective from AmScope, Mitutoyo, etc. to a mirrorless camera via a 3D-printed adapter or similar. Even an inexpensive Reakway objective (4×, around USD 25 / JPY 3,800) can produce usable images with the right stage and camera.
Infinity Photo-Optical InfiniProbe TS-160: A microscope-grade probe lens designed for cinema cameras. Uses Nelsonian optics (projection optics); combined with the Micro HM Objective, it covers up to 16× with working distances from 3 m down to 18 mm.
Focus stacking: At high magnifications, depth of field is extremely shallow (on the order of 10 to 100 microns), so you need to composite multiple shots at different focus positions. Zerene Stacker, Helicon Focus, and similar.
Focus stacking for video: For stills, the technique is established; for video, focus stacking assumes the subject is stationary. As a video-side equivalent, you can move focus at a constant rate using a motorized focus rail (e.g., NiSi NM-200) and then composite / stabilize.
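To give a feel for the numbers behind the focus-stacking items above, here is a rough sketch using the common close-up approximation DOF ≈ 2 · N · c · (m + 1) / m², where N is the f-number, c the circle of confusion, and m the magnification. This is a textbook approximation, not something stated in the sources above. The 0.03 mm circle of confusion, the 30% overlap between slices, and the function names are all assumptions, and microscope objectives are better described by numerical aperture than by this formula.

```python
import math


def closeup_depth_of_field_mm(f_number: float,
                              magnification: float,
                              circle_of_confusion_mm: float = 0.03) -> float:
    """Approximate total depth of field at macro distances.

    Uses DOF = 2 * N * c * (m + 1) / m**2 with the nominal f-number N.
    The 0.03 mm circle of confusion is an assumed full-frame value.
    """
    if f_number <= 0 or magnification <= 0 or circle_of_confusion_mm <= 0:
        raise ValueError("all inputs must be positive")
    return (2.0 * f_number * circle_of_confusion_mm
            * (magnification + 1.0) / magnification ** 2)


def stack_frame_count(subject_depth_mm: float,
                      dof_mm: float,
                      overlap: float = 0.3) -> int:
    """Number of focus slices needed to cover subject_depth_mm.

    overlap is the assumed fraction by which neighbouring slices overlap.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step_mm = dof_mm * (1.0 - overlap)
    return max(1, math.ceil(subject_depth_mm / step_mm))


if __name__ == "__main__":
    dof = closeup_depth_of_field_mm(f_number=2.8, magnification=5.0)
    print(f"DOF at 5:1, f/2.8: {dof * 1000:.0f} microns")
    print(f"Slices for a 2 mm deep subject: {stack_frame_count(2.0, dof)}")
```

Under these assumptions a 5:1 shot at f/2.8 has a depth of field of about 40 microns, inside the 10 to 100 micron range mentioned above, and a subject only 2 mm deep already needs several dozen slices.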

4.1.4 The workflow of macro and microscopic shooting

The actual shooting flow varies with the scale of the subject.

1:1 to 2:1 range: Doable on an ordinary tripod with a macro lens. Composition is built by choosing shutter speed, aperture (the f/4–f/8 sweet spot), ISO, and focus plane.
2:1 to 5:1 range: Handheld is impractical. A macro slider (NiSi NM-200, Kirk macro focusing rail, etc.), a ring light (Amscope LED ring, Godox LED head, etc.), and a subject stage are needed.
Microscope territory: Completely stationary subject, precision stage, strong lighting, and a motorized rail for focus stacking — effectively a lab setup.

For teaching, a realistic scale-up is: start with 1:1 macro (a standard lens like the Sony FE 90mm Macro), then let students experience the 2:1 to 5:1 world via Laowa 25mm Ultra Macro and similar.

4.2 Motion control — automated camera motion via Edelkrone

Constant speed, constant trajectory, repeatability, long-duration operation — motion that the human hand cannot produce. Motion control gives those properties to the camera. There is strong demand in product shooting, time-lapse, VFX plate work, music videos, and similar areas.

4.2.1 Edelkrone’s motion-control ecosystem

The Turkish company edelkrone has developed an extensive line of compact, modular motion-control gear. The representative product lines:

References

edelkrone official site
SliderONE v3
SliderPLUS v6
HeadONE v2

SliderONE v3: An ultra-compact motorized slider that fits in a backpack. ~27 cm long, ~20 cm travel, horizontal payload 9 kg, vertical payload 2.3 kg. Runs on LP-E6 batteries. Position is set via app / on-body buttons / by hand.
SliderPLUS v6: A slider that achieves 2× the slide distance of its body length, thanks to the “Double Distance” design. Sized to support cinematic dolly-in / out.
HeadONE v2: An ultra-compact motorized pan-tilt head. Pan or tilt alone with one unit, pan + tilt with two stacked, or as a product-shoot turntable with the Turntable Module.
HeadPLUS: A higher-end head with pan / tilt / focus 3-axis control plus AI subject tracking.
DollyONE / DollyPLUS: Floor-running dollies, no rail required. Two types: tabletop and floor.

Combining several pieces of gear yields synchronized 3- to 5-axis motion. For example, a full configuration of SliderPLUS + HeadONE × 2 + DollyPLUS gives you a “robotic camera operator”: a dolly running on the floor, a slider moving on top of it, and a pan-tilt head holding the frame above that.

4.2.2 Point Tracking — automating the parallax effect

A distinctive feature of edelkrone is “Point Tracking.” You designate a point in space by aiming the camera at it from two directions; from then on, even as the slider / dolly moves, the HeadONE automatically follows, keeping the point centered in the frame. This enables:

Parallax: the subject stays at the center of the frame as the camera slides, and only the background flows past.
Product shooting: spin a product on a 360° turntable while also sliding the camera, to emphasize dimensionality.
VFX plates: the same move can be reproduced exactly, making multi-pass composites easy.

This tracking is computed via inverse kinematics, not image processing, so it does not break in low light or low contrast. It is the opposite approach to AI-based tracking.
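The geometric idea behind aiming from two directions can be sketched in two dimensions. This is not edelkrone's implementation, which is not published in the sources above. It is a simplified top-down model in which the slider runs along the x axis, the pan angle is measured from straight ahead, and the two aimed sight lines are intersected to find the target. All function names are hypothetical.

```python
import math


def locate_target(x1_m: float, pan1_deg: float,
                  x2_m: float, pan2_deg: float) -> tuple[float, float]:
    """Intersect two sight lines taken from two slider positions.

    Returns (target_x_m, depth_m): the target's position along the slider
    axis and its distance in front of the slider. Pan angles are measured
    from straight ahead, positive toward +x.
    """
    if math.isclose(x1_m, x2_m):
        raise ValueError("the two slider positions must differ")
    t1 = math.tan(math.radians(pan1_deg))
    t2 = math.tan(math.radians(pan2_deg))
    if math.isclose(t1, t2):
        raise ValueError("sight lines are parallel; aim from two distinct directions")
    depth_m = (x2_m - x1_m) / (t1 - t2)
    target_x_m = x1_m + depth_m * t1
    return target_x_m, depth_m


def pan_angle_for(x_m: float, target: tuple[float, float]) -> float:
    """Pan angle (degrees) that keeps the target centered from slider position x_m."""
    target_x_m, depth_m = target
    return math.degrees(math.atan2(target_x_m - x_m, depth_m))


if __name__ == "__main__":
    # Aim at the subject from both ends of a 1 m move.
    target = locate_target(0.0, 45.0, 1.0, -45.0)
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"slider at {x:.2f} m -> pan {pan_angle_for(x, target):+.1f} degrees")
```

Because the pan angle comes purely from geometry once the point is known, nothing in this calculation depends on what the camera sees, which is consistent with the text's point that the tracking holds in low light or low contrast.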

4.2.3 Motion time-lapse

One of the canonical uses of Edelkrone gear is motion time-lapse.

Specify a start point and end point in the app.
Specify shot count (e.g., 240 frames) and total time (e.g., 30 minutes).
The slider performs the specified motion while firing the camera’s remote shutter.
The resulting frame sequence is turned into video in an editing tool (at 24 fps, that’s a 10-second clip).

You can record long durations of change (clouds drifting, stars rotating, a city transitioning from dusk to night) together with smooth camera motion. The whole automated shooting flow is self-contained on Edelkrone gear.
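The arithmetic in the steps above (240 frames over 30 minutes becoming a 10-second clip at 24 fps) can be wrapped in a small planning helper. This is a sketch for pre-shoot planning only. It does not talk to any edelkrone app or device, and the names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimelapsePlan:
    interval_s: float      # real-world seconds between shutter releases
    clip_length_s: float   # length of the finished clip
    speedup: float         # how many times faster than real time it plays


def plan_motion_timelapse(shot_count: int,
                          total_minutes: float,
                          playback_fps: float = 24.0) -> TimelapsePlan:
    """Derive interval, clip length, and speed-up from shot count and total time."""
    if shot_count <= 0 or total_minutes <= 0 or playback_fps <= 0:
        raise ValueError("shot count, total time, and fps must be positive")
    real_seconds = total_minutes * 60.0
    clip_length_s = shot_count / playback_fps
    return TimelapsePlan(
        interval_s=real_seconds / shot_count,
        clip_length_s=clip_length_s,
        speedup=real_seconds / clip_length_s,
    )


if __name__ == "__main__":
    plan = plan_motion_timelapse(shot_count=240, total_minutes=30)
    print(f"Interval between shots: {plan.interval_s:.1f} s")
    print(f"Finished clip: {plan.clip_length_s:.1f} s")
    print(f"Speed-up: {plan.speedup:.0f}x")
```

For the example in the text this gives one frame every 7.5 seconds and a 180× speed-up, which is a useful check on whether the interval is long enough for the chosen exposure time.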

4.2.4 Other options

Edelkrone is the headline example, but motion control has other options. Course and budget shape the choice.

Rhino Camera Gear ROV / Slider Pro: Slider lines in a similar price band to Edelkrone.
Syrp Genie II: A pan-only electronic head. Relatively inexpensive; good for time-lapse beginners.
Kessler CineDrive: Cinema-grade motion control. Higher cost, but professional reliability.
Bolt by MRMC: High-end robotic-arm motion control. The industry standard for product and commercial work.
DIY: Build your own with Arduino, stepper motors, and a slider. As a teaching exercise, the learning value is high.

For a course, an Edelkrone SliderONE + HeadONE combination (around JPY 200,000 in total) is the most defensible choice — best balance of mobility, operability, and extensibility.

4.3 Tilt-Shift / Small Planet — distortion of perception

Ordinary photography and video capture the world as it is. Some techniques instead distort perception itself and make the world look like something else. Tilt-Shift and Small Planet are the representative examples. AI generation can imitate the look they produce, but the shot-based texture remains distinct.

4.3.1 The principle of Tilt-Shift

A Tilt-Shift lens has a mechanism that tilts the optical axis (Tilt) or moves it parallel (Shift) relative to the image sensor. Representative gear: Canon TS-E 17mm / 24mm / 50mm / 90mm / 135mm, Nikon PC-E series, and so on.

References

StudioBinder: “Tilt-Shift Shot”

Tilt motion uses the Scheimpflug principle (the optical rule that the plane of focus, sensor plane, and lens plane, extended, meet at a single line). Tilting the lens breaks the normally parallel relationship between focus plane and sensor plane, so the in-focus plane can be laid diagonally through the scene, or angled against it so that only a thin band appears sharp.

This enables expressions like:

Perspective correction for architecture: When shooting a tall building from below, the natural “narrowing toward the top” is corrected back to parallel via Shift.
Selective focus: Only a specific band of the frame is in focus; everything else is blurred. As a video expression, the gaze guidance is strong.
Miniature effect (Small Diorama): Making depth of field extremely shallow makes life-size scenery look like a miniature model. This is the most popular Tilt-Shift application.

4.3.2 Conditions for the miniature effect

The miniature effect doesn’t come from the Tilt-Shift lens alone — it depends on a combination of shooting conditions.

Shoot from above: Take an angle that looks down on the subject. The audience’s gaze aligns with the “look down on a miniature diorama” frame, and the effect lands more easily.
Subjects appear small: It looks more like a “scaled model” when the subject is placed small within the frame.
Subjects in motion: People, cars, and other moving things heighten the effect. The detail of their motion gets read as if seen through a macro lens.
Pair with time-lapse: Shooting time-lapse at 1–2 second intervals yields a sped-up miniature world. Keith Loutit (the creator of the Bathtub series and The Lion City series, widely regarded as a pioneer of the Tilt-Shift time-lapse miniaturization technique) and Sam O’Hare (The Sandpit, 2010, miniaturizing a day in New York) are representative practitioners.
Push saturation: Grading the color a bit strong gives a toy-like texture.

4.3.3 Tilt-Shift in post

Tilt-Shift lenses are expensive (JPY 100,000–300,000+), so ownership isn’t always assumed. Pseudo Tilt-Shift effects can be made in Adobe Premiere Pro, After Effects, Photoshop, and similar tools.

References

Adobe Premiere Pro official tutorial: Tilt-Shift Miniature Effect

A basic flow in Premiere Pro:

Import the source, create a new sequence.
Create a Color Matte and a black-and-white gradient mask (a thin shape transparent in the middle, opaque top and bottom).
Apply Gaussian blur through the mask so only top and bottom are blurred.
Push saturation, then speed the clip up (Speed/Duration) and apply Posterize Time to lock it to a lower frame rate, which gives the choppy cadence of time-lapse.

The same workflow holds in After Effects, with finer mask animation control.
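The gradient mask in the flow above can be read as a per-row blur weight: zero inside the sharp band, ramping up to full blur toward the top and bottom of the frame. The following is a minimal Python sketch under that reading. The function name `blur_weight` and the default band and falloff values are illustrative assumptions, not settings taken from Premiere Pro or After Effects.

```python
def blur_weight(y: int, height: int, focus_center: float = 0.5,
                focus_half_band: float = 0.1, falloff: float = 0.25) -> float:
    """Return the blur strength in [0, 1] for image row y (0 = top row).

    focus_center, focus_half_band and falloff are fractions of the frame
    height (hypothetical parameters chosen for illustration).
    """
    if falloff <= 0:
        raise ValueError("falloff must be positive")
    if height <= 1:
        return 0.0
    pos = y / (height - 1)               # 0.0 at the top row, 1.0 at the bottom
    distance = abs(pos - focus_center)   # distance from the sharp band's center
    if distance <= focus_half_band:
        return 0.0                       # inside the band: keep the row sharp
    # Outside the band: ramp linearly up to full blur, then clamp at 1.0.
    return min(1.0, (distance - focus_half_band) / falloff)
```

A compositor would then mix each row as `sharp * (1 - w) + blurred * w`. Narrowing `focus_half_band` strengthens the miniature look; widening it weakens the effect.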

4.3.4 Small Planet (polar coordinates)

Small Planet is a technique that converts 360° panoramic photos / videos into a “globe-like small planet” landscape via polar coordinate transformation. It’s a separate lineage from Tilt-Shift, but it’s an adjacent technique in terms of distorting perception.

The workflow:

Shoot with a 360° camera (Insta360, GoPro Max, Ricoh Theta, etc.).
Import the resulting full-sphere panorama and apply a Polar Coordinates transformation (rectangular → polar).
A “planet” floats in the middle, with the sky expanded around it — a distinctive composition.
For video, the “planet” rotates / deforms in time with camera rotation or motion.

The CC Sphere effect in After Effects, the Polar Coordinates filter in Photoshop, Insta360 Studio, and similar tools, all implement this.
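As a rough illustration of the polar remapping step, the sketch below walks from a pixel in the square output image back to the position in the flat panorama it should sample. It assumes the plain polar remap (angle becomes panorama column, radius becomes panorama row), with the ground at the bottom of the panorama mapped to the center of the disc. The function name `planet_to_panorama` is hypothetical, and dedicated 360° tools may use a different projection.

```python
import math
from typing import Optional, Tuple


def planet_to_panorama(x: float, y: float, out_size: int,
                       pano_width: int, pano_height: int
                       ) -> Optional[Tuple[float, float]]:
    """Map an output pixel (x, y) to a (column, row) in the panorama.

    Returns None for pixels outside the planet disc.
    """
    if out_size < 2:
        raise ValueError("out_size must be at least 2")
    center = (out_size - 1) / 2.0
    dx, dy = x - center, y - center
    radius = math.hypot(dx, dy)
    max_radius = center
    if radius > max_radius:
        return None
    angle = math.atan2(dy, dx)  # range -pi..pi
    # The angle around the disc selects the panorama column.
    col = ((angle / (2 * math.pi)) % 1.0) * (pano_width - 1)
    # The radius selects the row: center = bottom row (ground), rim = top row (sky).
    row = (1.0 - radius / max_radius) * (pano_height - 1)
    return col, row
```

Looping this over every output pixel and sampling the panorama at the returned position yields the "planet"; swapping the row mapping so that the sky lands in the center gives the inverted "tunnel" variant.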

4.3.5 Relation to AI generation

For both Tilt-Shift and the miniature effect, AI image generators (Stable Diffusion, Midjourney, etc.) can produce similar-looking images with the “tilt-shift” or “miniature” prompt. Still, there remains a difference between the shot-based texture (a real city, real people, real light) and the generated texture (an averaged image drawn from the model’s distribution). Returning to this essay’s theme, this is “materiality” expressing itself.

4.4 Long exposure — the layering of time

Long exposure is a technique that folds an interval of time into a single image by leaving the shutter open far longer than usual. It allows expressions of time that the naked eye cannot see — light trails, star trails, smooth water surfaces, flowing clouds, the thinning out of crowds.

4.4.1 The basics of long exposure

Equipment and settings for long exposure:

Tripod: Mandatory. Handheld is practically impossible.
Remote shutter / timer: To avoid shake from pressing the shutter.
ND filter: For long exposures in bright environments, a 6- to 10-stop neutral density filter is needed. Hoya ProND, NiSi, B+W are the standard choices.
Exposure mode: Manual or aperture-priority for timed exposures; Bulb mode for arbitrarily long ones.
Rule-of-thumb settings: 10 to 30 seconds for city light trails, several minutes to hours for star trails, several seconds to tens of seconds for smooth water.
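The ND-filter figures above follow from simple stop arithmetic: each stop halves the light, so the shutter time doubles per stop. The sketch below shows that calculation. The function names are illustrative, and the example values in the usage note are assumptions rather than measured settings.

```python
import math


def exposure_with_nd(base_shutter_s: float, nd_stops: float) -> float:
    """Shutter time needed once an ND filter of nd_stops is mounted.

    Each stop halves the light, so the exposure time doubles per stop.
    """
    return base_shutter_s * 2 ** nd_stops


def nd_stops_needed(base_shutter_s: float, target_shutter_s: float) -> float:
    """Stops of ND needed to stretch base_shutter_s to target_shutter_s."""
    return math.log2(target_shutter_s / base_shutter_s)
```

For example, a scene metered at 1/125 s becomes roughly an 8-second exposure behind a 10-stop filter (`exposure_with_nd(1 / 125, 10)`), which is why the 6- to 10-stop range is the practical one for daytime long exposure.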

4.4.2 Applications of light-trail shooting

Light trails are one of the most popular long-exposure applications. Moving light sources — car headlights and taillights, Ferris wheel illumination, fireworks, the trail of a pen light — remain on the picture as “lines of light.”

Representative variations:

Road light trails: From a viewpoint looking down on an elevated highway, capture the trails of car headlights (white) and taillights (red). A Canon example was shot at 13 seconds, f/9, ISO 100.
Ferris wheels / amusement parks: Rotating lights form circular or spiral trails.
Star trails: The rotational trails of stars around Polaris. Face north and run an exposure of several minutes to several hours. In astro-landscape photography, the “500 rule” (max exposure in seconds = 500 / focal length) is a rule of thumb for keeping stars as points; for star-trail shooting, you go beyond it with a longer single exposure, or alternatively stack many short exposures using software like StarStaX.
Pen light / light painting: Carry an LED in a dark environment and “draw” letters or shapes in space.
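The "500 rule" mentioned under star trails is easy to put into a helper. The sketch below implements the rule as stated, plus a frame count for the stacking approach. The `crop_factor` parameter (to convert to a full-frame-equivalent focal length) and the `gap_s` pause between frames are assumptions added for illustration.

```python
import math


def max_point_star_exposure(focal_length_mm: float,
                            crop_factor: float = 1.0) -> float:
    """Longest exposure in seconds that keeps stars as points (500 rule)."""
    return 500.0 / (focal_length_mm * crop_factor)


def frames_for_trail(total_minutes: float, exposure_s: float,
                     gap_s: float = 1.0) -> int:
    """Number of short exposures needed to cover total_minutes when stacking."""
    return math.ceil(total_minutes * 60.0 / (exposure_s + gap_s))
```

With a 20 mm lens on a full-frame body the rule gives 25 seconds per frame. Covering a one-hour trail with 19-second exposures and a 1-second gap takes 180 frames.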

4.4.3 Long-exposure-style effects in video

Long exposure is a stills technique, but you can produce a similar effect in video.

References

PetaPixel: “Long Exposure Light Trails in Video with After Effects”

Drawing from Dan Marker-Moore’s tutorials, the following methods are available in After Effects:

Method A: layer duplication + Lighter Color: Import a video shot on a locked-off camera, duplicate the layer multiple times, and offset each by 1 to a few frames. Setting the blend mode to “Lighter Color” (composites the brighter pixel) leaves only the bright trails on the frame.
Method B: Echo effect: Apply the Echo effect (Effect > Time > Echo in After Effects) to the video layer or an adjustment layer; set the Echo Operator to “Maximum” and choose the number of echoes. This also accumulates highlights.
Combined with motion time-lapse: Move the camera with Edelkrone or similar, and at each frame run a long exposure (interval shooting in bulb). The trails are then recorded while the camera itself is moving.

These methods enable a distinctive expression that has the texture of a stills-style long exposure while being video. A Toyota commercial piece (made by Dan Marker-Moore) is widely known for using this technique.
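Methods A and B both come down to the same idea: for each pixel, keep the brightest value seen across a run of frames. The sketch below shows that idea on tiny grayscale frames represented as nested lists. It is a simplification for illustration (single channel, no frame offsets or decay), not a reproduction of what After Effects does internally.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of 0-255 brightness values


def lighten_stack(frames: List[Frame]) -> Frame:
    """Composite frames by keeping the brightest value at each pixel."""
    if not frames:
        raise ValueError("at least one frame is required")
    height, width = len(frames[0]), len(frames[0][0])
    result = [row[:] for row in frames[0]]  # start from a copy of frame 0
    for frame in frames[1:]:
        for y in range(height):
            for x in range(width):
                if frame[y][x] > result[y][x]:
                    result[y][x] = frame[y][x]
    return result
```

A bright dot that moves one pixel per frame across a dark background comes out as a continuous line, which is the light trail. Dark moving objects leave no trace, because the brighter background always wins.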

4.4.4 Long exposure and “the texture of time”

Purely as a technique, long exposure is the simple act of “leaving the shutter open longer.” The thinking behind it, however, is deep.

The opposite of photography that captures the instant: photography that folds time
A single image that condenses the total time the audience spent in a place
What moves disappears; what stays still remains
The bustle of a city becomes lines of light; the flow of people becomes a smoky band

These textures are, in principle, hard to produce with AI generation. AI generation samples video from the distribution of training data, but long exposure is “the integral of real physical time.” Even if the training data contains long-exposure examples, the “integral nature” of the technique is lost at generation time.

5. Time manipulation — slow motion and speed change

By operating on the time axis that shooting records, you can produce video that exceeds ordinary human perception. Slow motion stretches time; hyperlapse compresses it; speed ramping creates rhythm. This chapter takes these techniques from four angles — principle, equipment, SNS-native practice (aaa_tsushi), and post processing.

5.1 The theory of high-frame-rate shooting

Slow motion works because the shooting frame rate is higher than the playback frame rate. Playback runs at a fixed rate (24, 25, 30, or 60 fps, for example), so raising the shooting frame rate stretches time by the ratio of the two.

5.1.1 The arithmetic of slow motion

The basic formula:

Slowdown ratio = shooting fps ÷ playback fps

Examples assuming 24 fps playback:

60 fps shooting → 2.5× slow
96 fps shooting → 4× slow
120 fps shooting → 5× slow
240 fps shooting → 10× slow
960 fps shooting → 40× slow (Phantom-class high-speed cameras)

You can run the calculation in the other direction too. To place a 5× slow shot in 8 seconds of cut, the real-time shooting length you need is 8 ÷ 5 = 1.6 seconds. Calculate this up front and you waste less and hesitate less on set.
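The arithmetic above fits in two small helpers, useful as a pre-shoot check. This is a minimal sketch; the function names are illustrative, and 24 fps playback is assumed as the default to match the examples.

```python
def slowdown_ratio(shooting_fps: float, playback_fps: float = 24.0) -> float:
    """How many times slower the footage plays than real time."""
    return shooting_fps / playback_fps


def capture_seconds_needed(cut_seconds: float, shooting_fps: float,
                           playback_fps: float = 24.0) -> float:
    """Real-time seconds to record so the slowed shot fills cut_seconds."""
    return cut_seconds / slowdown_ratio(shooting_fps, playback_fps)
```

`capture_seconds_needed(8, 120)` reproduces the 1.6-second figure from the text. On set it is worth padding that number with a little lead-in and lead-out so the edit has handles.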

5.1.2 The 180° shutter rule and the exposure penalty

High-frame-rate shooting comes with shutter-speed and exposure constraints.

180° shutter rule: For natural motion blur, shutter speed = 1 / (frame rate × 2) is the baseline.
- 24 fps → 1/48 s (1/50 in practice)
- 60 fps → 1/120 s
- 120 fps → 1/240 s
- 240 fps → 1/480 s
Exposure penalty: As the shutter gets faster, each frame collects less light, so the scene needs proportionally more. Approximate stops of extra light needed vs. 24 fps:
- 60 fps: about 1.3 stops more light
- 120 fps: about 2.3 stops
- 240 fps: about 3.3 stops
- 960 fps: about 5.3 stops
The ND-filter inversion: A daytime scene that would have needed an ND filter for ordinary shooting can become “not enough light” for high-frame-rate shooting. Indoors, additional lighting is practically a must beyond 120 fps.
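Both lists above follow from two formulas: the shutter time for a given shutter angle, and the base-2 logarithm of the frame-rate ratio for the exposure penalty. The sketch below assumes the shutter angle stays at 180° while the frame rate changes; the function names are illustrative.

```python
import math


def shutter_for_angle(fps: float, angle_deg: float = 180.0) -> float:
    """Shutter time in seconds for a given frame rate and shutter angle."""
    return (angle_deg / 360.0) / fps


def extra_stops(fps: float, base_fps: float = 24.0) -> float:
    """Stops of additional light needed vs. base_fps at the same shutter angle."""
    return math.log2(fps / base_fps)
```

`shutter_for_angle(120)` gives 1/240 s, and `extra_stops(240)` gives about 3.3 stops. In practice the penalty has to be paid with a wider aperture, higher ISO, less ND, or more light on the subject.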

5.1.3 The resolution tradeoff

On many cameras, raising the frame rate lowers the resolution. This comes from the physical readout limits of the sensor. Representative examples:

Canon EOS R5 Mark II: 8K @ 60 fps, 4K @ 120 fps, Full HD @ 240 fps
Sony FX3: 4K @ 120 fps, Full HD @ 240 fps
RED V-RAPTOR XL: 8K @ 120 fps, 4K @ 240 fps, 2K @ 600 fps
Panasonic GH7: 4K @ 120 fps, Full HD @ 300 fps (VFR mode)

So the trifecta “high frame rate × high resolution × long duration” remains hard to satisfy at once. Pro gear has looser limits, but the limit still exists. Check that the required resolution and frame rate are simultaneously feasible before the shoot.
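One rough way to see the tradeoff is to compare how many pixels per second each mode asks the sensor to read out. The sketch below assumes the consumer UHD pixel dimensions for "8K", "4K" and "Full HD"; real cameras may use DCI widths, crops, or line skipping, and codec, heat, and storage limits also play a part, so treat the numbers as an order-of-magnitude comparison only.

```python
# Assumed pixel dimensions (UHD family); actual camera modes may differ.
RESOLUTIONS = {
    "8K": (7680, 4320),
    "4K": (3840, 2160),
    "Full HD": (1920, 1080),
}


def pixel_rate(resolution: str, fps: float) -> float:
    """Pixels per second the sensor must deliver for a given mode."""
    width, height = RESOLUTIONS[resolution]
    return width * height * fps
```

Halving the resolution on each axis cuts the pixel count to a quarter, which is why dropping from 4K to Full HD frees enough readout budget to raise the frame rate.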

5.2 Equipment for slow motion

The speed range and image quality that slow-motion gear can reach vary widely. Since the options depend on the course and the budget, we organize them by tier.

5.2.1 Smartphones (120 to 240 fps)

iPhones and recent Android flagships (Samsung Galaxy S25 Ultra, etc.) have a 120 fps to 240 fps “Slo-Mo” mode. Resolution is often capped at 1080p, but that is enough for entry- to intermediate-level slow-motion material.

iPhone Slo-Mo: 1080p @ 240 fps is standard (iPhone 8 and later). Pro models in some generations support 4K @ 120 fps.
Samsung Galaxy S25 Ultra: features 1080p @ 240 fps “Super Slow Motion.” Aaa Tsushi (a Team Galaxy ambassador, discussed later) shoots with this gear.
Pros: easy, no extra gear, easy to upload directly to SNS.
Cons: narrow dynamic range, weak in low light, and limited editing latitude (Log shooting and similar options are often missing or restricted).

5.2.2 Pocket cameras (DJI Osmo Pocket 3)

DJI Osmo Pocket 3 is a palm-sized camera with a 1-inch sensor and a built-in 3-axis gimbal.

4K @ 120 fps, 1080p @ 240 fps.
Low shake from the built-in gimbal; handheld stays stable for slow motion.
D-Log M support gives editing latitude.
Pros: gimbal + high speed at this price point.
Cons: sensor-size limits (night and low light feel underwhelming).

5.2.3 Mirrorless / cinema cameras (120 to 240 fps)

Serious high-frame-rate work calls for mirrorless or cinema cameras.

Sony FX3 / FX6: Cinema-line entry. 4K @ 120 fps, Full HD @ 240 fps. S-Log3, full-frame sensor with wide dynamic range. USD 3,000 to USD 6,000 (around JPY 450,000 to JPY 900,000).
Canon EOS R5 Mark II: 8K RAW @ 60 fps, 4K @ 120 fps, Full HD @ 240 fps. 10-bit internal recording. Current flagship released in 2024. Around USD 4,300 (around JPY 650,000). The first-generation R5 (2020) topped out at 120 fps for Full HD; 240 fps came with the Mark II.
Panasonic GH7: 5.8K @ 30 fps, 4K @ 120 fps, Full HD @ 240 fps (Full HD @ 300 fps in VFR mode). Internal V-Log / ProRes RAW recording despite Micro Four Thirds. From USD 2,200 (around JPY 330,000).
Blackmagic Pocket Cinema Camera 6K G2: 6K @ 50 fps, 4K @ 60 fps, 2.8K @ 120 fps, Full HD @ 120 fps (windowed). Blackmagic RAW support. Ships with a DaVinci Resolve Studio license. USD 1,995 (around JPY 300,000).
RED V-RAPTOR / V-RAPTOR XL: 8K (17:9) @ 120 fps, 4K (17:9) @ 240 fps, 2K (2.4:1) @ 600 fps. The representative REDCODE RAW camera. From USD 24,500 (around JPY 3,700,000).

5.2.4 Dedicated high-speed cameras (960 fps and beyond)

For “super slow motion” at 960 fps and above, dedicated cameras are required.

Phantom VEO 4K: 4K (4096 × 2160) @ 1000 fps, 2K @ ~2000 fps. Widely used for commercial and scientific shooting. From USD 50,000 (around JPY 7,500,000).
Phantom T3610: Capable beyond 1 Mfps (one million fps). Used in scientific applications like ballistics or impact testing.
Chronos 2.1-HD: A relatively affordable high-speed camera (USD 5,000s, around JPY 750,000). 1080p @ about 1000 fps, with higher rates at reduced resolution. Has deployments at educational institutions.

For a course, a Sony FX3-class body + a smartphone (iPhone, etc.) + (optionally) a DJI Osmo Pocket 3 covers the 240 fps range adequately. For 960 fps and beyond, a realistic option is a one-day rental as a hands-on workshop.

5.3 SNS-native smartphone slow motion — the aaa_tsushi case

Aaa Tsushi (@aaa_tsushi_) is a video creator based in Fukuoka (currently moving between Tokyo and Fukuoka). With 684K followers on Instagram, 3.3M on TikTok, and 137K on Threads, the work centers on “how to shoot photos and videos with a smartphone.” Affiliated with PPP STUDIO (an MCN for TikTok creators). Official Samsung “Team Galaxy” ambassador.

5.3.1 Background and the shape of the work

Aaa Tsushi worked at a regional bank straight out of school, while posting DSLR photos and drone footage on SNS as a hobby. After seeing the work, the CEO of a video-production startup headhunted him via Instagram DM, and he switched careers. He started TikTok in January 2019 and crossed 2.4M followers in just over six months.

His book iPhoneグラフィ: 日常の一瞬がドラマに変わる全撮影術 (KADOKAWA, 2021) — “iPhoneGraphy: How to Shoot Photos and Video That Turn the Everyday into Drama” — has, from the table of contents, the following structure:

Chapter 1: “iPhone features for shooting” you can use (photo and video).
Chapter 2: “Photo techniques” that turn the everyday into drama (composition, evocative portraits, how to shoot a plastic bottle).
Chapter 3: “Video techniques” you’ve never seen (mirror reflections, kicking the screen, and similar staging).
Chapter 4: “Easy retouching and editing” to lift the final quality.

The book’s tagline, “turning a moment of everyday into drama,” captures the core of the work succinctly: an approach that converts an offhand everyday moment into a dramatic video using only smartphone shooting and edit judgments.

5.3.2 Major techniques and recent practice

From recent SNS posts (2025 to 2026), here are the major techniques Aaa Tsushi uses.

AI generative transitions: Video that mixes original shot material with AI-generated transitions (e.g., posted 2025/11/11). As a TikTok PR campaign (#PR #connectbytourism #wakayama), the workflow crosses between live action and AI.
Hyperlapse: PR videos that use the Galaxy S25 Ultra hyperlapse feature. Time-lapse shot while walking, where the scene compresses along with motion — a distinctive expression.
Galaxy AI demos: Demos of features like Galaxy Z Fold7’s “remove everything you added yourself” (generative-AI subject removal), in collaborations with equipment makers that integrate the latest AI features into the work.
Live event shooting: High-quality shooting of a K-pop live event (ENHYPEN world tour) on Galaxy S25 Ultra. A showcase for the night / low-light performance of a phone — “you can shoot it cleanly even from the second balcony.”

5.3.3 Principles you can learn from the work

From Aaa Tsushi’s work, the educational principles for smartphone slow motion and time manipulation can be organized as follows.

Using the 240 fps modes of iPhone / Galaxy: Starting from the standard camera app’s Slo-Mo mode, pick subjects with motion (splashes of water, flowing hair, fabric motion, water poured from a plastic bottle) and shoot them.
Timing design: At 240 fps shooting and 24 fps playback, one second of action becomes ten seconds of video. Start the shot a little earlier and end it a little later to leave room for editing.
Composition’s interaction with slow motion: Slow motion emphasizes the “trace of motion.” Direction, speed, and endpoint of motion feed directly into compositional design.
Speed ramping in the edit: Use speed ramps — varying speed within a single shot — in CapCut, Premiere Pro, Final Cut Pro, and similar. Normal → slow → back to normal produces an emotional wave.
AI integration: In SNS, mixing shot material and AI-generated material seamlessly is becoming the baseline rather than the exception.

The Aaa Tsushi case embodies a three-layer structure for the SNS era: “smartphone + edit judgment + AI assistance.” It shows that slow motion can be established along a route different from the classical “cinema camera + tripod + RAW workflow.”

5.4 Time manipulation in post

If you couldn’t shoot in slow motion at capture time, or if you want to slow down an existing standard-speed clip, time manipulation moves to post. Premiere Pro and After Effects each offer multiple methods.

5.4.1 Premiere Pro’s three interpolation modes

References

Adobe Premiere Pro official: Time Interpolation Methods

In Premiere Pro’s “Speed/Duration” dialog, you can choose between three Time Interpolation methods.

Frame Sampling: Repeats / drops existing frames. The lightest computationally; stable on footage with little motion. The cost is that slowdowns retain a stuttery feel.
Frame Blending: Blends adjacent frames to synthesize intermediate frames. Produces blur, but is smoother than Frame Sampling.
Optical Flow: Analyzes pixel motion and generates intermediate frames. Yields the smoothest result, but is computationally heavy; complex motion (overlapping subjects, sudden movement, repeating patterns, etc.) produces artifacts (warping, ghosting, smear).
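The difference between the first two modes can be shown on a toy clip where each "frame" is a single brightness value. The sketch below is an illustration of the general idea (repeat the nearest earlier frame vs. cross-fade between neighbors), not Premiere Pro's actual implementation. Optical Flow is left out because it requires motion estimation.

```python
from typing import List


def frame_sampling(frames: List[float], factor: float) -> List[float]:
    """Slow down by `factor` by repeating existing frames."""
    out_count = int(len(frames) * factor)
    last = len(frames) - 1
    return [frames[min(int(i / factor), last)] for i in range(out_count)]


def frame_blending(frames: List[float], factor: float) -> List[float]:
    """Slow down by `factor` by cross-fading between neighboring frames."""
    out_count = int(len(frames) * factor)
    last = len(frames) - 1
    out = []
    for i in range(out_count):
        pos = i / factor            # fractional position in the source clip
        lo = min(int(pos), last)    # source frame at or before pos
        hi = min(lo + 1, last)      # next source frame (clamped at the end)
        t = pos - lo                # blend weight toward the next frame
        out.append((1 - t) * frames[lo] + t * frames[hi])
    return out
```

For a two-frame clip `[0.0, 10.0]` slowed 2×, sampling holds each value twice, while blending inserts a 5.0 in between. With real images that in-between frame is a double exposure of its neighbors, which is the blur the text describes.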

Practical guidelines:

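Light slowdown (60 fps footage conformed to a 24 fps timeline): Frame Sampling / Blending is enough.
Extreme slowdown (24 fps footage stretched to about one-third speed, so only 8 source frames cover each second of output): Even Optical Flow struggles. Reshoot if possible.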
Static shots with little motion: Optical Flow gives a clean result.
Complex motion: Frame Blending, or split the picture with a mask and use a different method per region.

5.4.2 Time Remapping in After Effects

After Effects’ “Time Remap” gives finer control than Premiere Pro’s speed change.

Right-click a layer → Time → Enable Time Remapping.
Keyframe the time value shown on the timeline.
Specify “return to 0 seconds here,” “2× speed here,” and so on at arbitrary points.

This produces complex time control on a single clip — speed ramps, freeze frames (pauses), reverse playback, repeats. Enabling Frame Blending gives smoother results than frame sampling; Pixel Motion mode (the Optical Flow equivalent) smooths it further.
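Conceptually, a time remap is a curve from timeline time to source time, and the keyframes are points on that curve. The sketch below assumes linear interpolation between keyframes and clamping outside them; the names `remap_time` and `RAMP` are hypothetical, and After Effects also offers eased keyframes that this does not model.

```python
from bisect import bisect_right
from typing import List, Tuple

Keyframe = Tuple[float, float]  # (timeline seconds, source seconds)


def remap_time(keyframes: List[Keyframe], t: float) -> float:
    """Return the source time shown at timeline time t."""
    if len(keyframes) < 2:
        raise ValueError("need at least two keyframes")
    times = [k[0] for k in keyframes]
    if any(b <= a for a, b in zip(times, times[1:])):
        raise ValueError("timeline times must be strictly increasing")
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t) - 1          # segment that contains t
    (t0, s0), (t1, s1) = keyframes[i], keyframes[i + 1]
    return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


# Speed ramp: normal speed, then quarter speed for two seconds, then normal.
RAMP: List[Keyframe] = [(0.0, 0.0), (1.0, 1.0), (3.0, 1.5), (4.0, 2.5)]
```

The slope of each segment is the playback speed: a slope of 1 is normal, a flat segment is a freeze frame, and a falling segment plays in reverse.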

5.4.3 Twixtor — the industry-standard optical-flow tool

Twixtor from RE:Vision Effects is a plug-in dedicated to time interpolation, working in After Effects / Premiere Pro / DaVinci Resolve / Final Cut Pro. It supports retiming up to 160×, and is positioned in the industry as the standard “last resort.”

Twixtor’s features:

Pixel-level motion tracking (optical flow) and new-frame synthesis.
Slowdowns well beyond what simple frame blending can hold together (there are practical examples of 24 fps footage slowed to below 1/5 speed).
Mask + roto for separating foreground from background to suppress artifacts.
360° video support (motion tracking accounts for left / right and top / bottom continuity).
The Pro version adds manual tracking points, multiple matte specifications, and similar features.

Practical use:

Premiere Pro Optical Flow: The default for cases where light slowdown is enough.
After Effects Time Remap: For complex edits with multiple speed ranges.
Twixtor: For extreme slowdowns or when top quality is required.

5.4.4 Alternatives and successors to Twixtor

Twixtor has been the industry standard for years, but alternatives and successors have grown in recent times.

Boris FX Continuum “BCC Optical Flow”: The Continuum-suite optical flow. Many presets; easy to handle.
Sapphire “S_Retime”: Included in the Sapphire suite. Pairs well with other VFX.
SpeedX: AI-driven intermediate-frame generation. A new plug-in supporting After Effects / Premiere Pro.
Topaz Video AI: An AI-based video enhancement tool. Bundles frame interpolation, resolution upscaling, and denoising.
Adobe-side AI integration: Backed by Adobe’s Firefly / Sensei, AI-based time interpolation is expected to be more deeply integrated into Premiere Pro / After Effects over time.

5.4.5 Judgments to make at capture time

You can manipulate time in post, but capturing at the right frame rate is always higher quality. As teaching material, the recommended order is: first, let students experience “slow motion that assumes high-frame-rate shooting”; then teach interpolation as a fallback — “you can still slow it down in post even at standard frame rate.”

5.5 What it means to edit time

Slow motion, hyperlapse, speed ramping. These techniques all show the world at a time scale beyond ordinary human perception. In the age of AI generation, what significance do these techniques carry?

AI generation does not guarantee physical consistency on the time axis: As touched on in Chapter 2, AI-generated video is weak on physical-law consistency. In slow motion, that consistency is the very thing on display. Water droplets falling, hair flowing, fabric moving — stretching motion that follows physics lets the audience see an order they normally cannot. In AI-generated slow motion, that order tends to break.
Time manipulation connects with the judgments of shooting: Whether to use slow motion, which moment to stretch, where to return to normal speed — these judgments are extensions of compositional shooting. The “intent → emotion → technique” chain from Chapter 3 applies on the time axis too.
Dissolving the SNS-native / cinema boundary: As the Aaa Tsushi case shows, even a phone + AI assistance can capture 240 fps footage and manipulate time in the edit. The boundary between cinematic time manipulation and SNS-style time manipulation is thinning thanks to the democratization of equipment. Course design should assume that thinning and teach both seamlessly.

6. The significance of shooting in the AI era

The compositional shooting, special-purpose shooting, and time manipulation covered in Chapters 3 through 5 are each their own lineage. But there is a common logic running through them: “producing textures in territory that AI generation cannot reach in principle.” This chapter sums up that shared logic, and the figure of the shooter in the AI era.

6.1 The common logic across the three lineages

Compositional shooting (Chapter 3) writes the shooter’s “judgment” into the picture. Framing, shot size, camera motion, shot design that anticipates the edit — it is judgment after judgment.

Special-purpose shooting (Chapter 4) writes the equipment’s “materiality” into the picture. The optical limits of a macro lens, the motor precision of Edelkrone, the lens mechanism of Tilt-Shift, the sensor integration of long exposure — all of them depend on the characteristics of physical equipment.

Time manipulation (Chapter 5) writes the “arithmetic” of the time axis into the picture. The relationships among frame rate, shutter speed, and shutter angle are mathematically fixed, and the picture is the result of those relationships combined with the equipment and the light at the site.

None of “judgment,” “materiality,” and “arithmetic” is straightforward to reproduce in AI generation. AI generation samples video from an averaged distribution extracted from training data, and so it cannot directly express individual judgment, physics, or arithmetic. Shooting is the act that inscribes those three axes onto the picture directly.

6.2 The possibility of collaboration with AI generation

Shooting and AI generation are not in opposition. In recent years, workflows that combine them have developed rapidly.

i2v (image-to-video): AI generates motion from a captured still as a starting point. The shooter’s compositional judgment is combined with AI motion generation.
v2v (video-to-video): Captured footage is style-transferred or edited. Shot motion mixes with AI-side texture.
Background replacement: With tools like Kling AI, which Aaa Tsushi mentions, the subject is captured live while the background is generated. The approach is spreading rapidly as a substitute for green-screen shooting.
Generative transitions: Insert AI-generated transitions between live-action clips to reinforce scene connections that would be hard to shoot.
Cinema × AI: As Blake Ridder’s Cannes demo workshop shows, cinematic shooting and AI generation are spreading as a new combination for producing a cinematic picture on a low budget.

These integrations require a design that uses both at different stages of the pipeline, not an either / or choice. The shooter increasingly needs to make capture-time judgments while aware of which stage their shot will combine with AI later.

6.3 The shooter as an AI-era figure

Applying the “high-agency artist” concept from Andrew Price (covered in installment #1) to the shooter yields a figure like the following.

Can decide on equipment: phone vs. mirrorless vs. cinema camera, and can explain why a given choice was made.
Can articulate intent within the frame: what to show, what not to show, and why.
Can plan for what comes after shooting: captures material with editing, color, and AI integration in mind.
Strong sense for the physical world: alert to the on-site light, air, time of day, and movement of people around the subject.
Can collaborate with AI generation: designs how to feed shot material into AI generation and how to integrate AI-generated material with shot footage.

This figure of the shooter calls for both classical cinematography training (light, composition, edit, color) and the new SNS / AI-era sensibility (vertical format, integration with generative AI, immediacy). Course design accordingly needs a structure that teaches both seamlessly.

6.4 Why shooting persists

To condense the whole essay’s argument into one statement:

However far AI generation develops, shooting will not vanish. The reason is that shooting is itself the act of “raising a camera at a real site,” and that act has expressive substance of its own. Shots that cannot be captured without the shooter’s body on site, shots inscribed with the materiality of equipment, one-time-only shots where subject, light, and time happened to intersect: these cannot, in principle, be made by AI generation.

This is the significance of treating shooting in a curriculum. Teach only AI generation, and you raise “students who can write prompts.” Teach only shooting, and you raise “students who cannot use the contemporary tools.” Teach both, and raise students with the judgment to combine the two — that is the direction this series points toward.

Shooting is the practice of judgment in the AI era, and it is the place where that judgment becomes visible.

7. Conclusion

This essay surveyed shooting practice across three lineages: compositional shooting, special-purpose shooting, and time manipulation. The SNS-native shooting of Shirai Futa (@futa.729s), the narrative-cinematic shooting of Blake Ridder, the theory of cinematographic camerawork, motion control with Edelkrone, macro and microscopic shooting, Tilt-Shift / Small Planet, long exposure and light trails, smartphone slow motion by Aaa Tsushi (@aaa_tsushi_), time manipulation in Premiere / After Effects — these were connected through the lens of capturing judgment and materiality in the AI era.

The framing introduced in installment #1 from Andrew Price — “what remains in the AI era is judgment” — applies consistently to shooting practice as well. Shooting is a chain of judgments; the materiality of equipment and the embodiment of the shooter are inscribed in the picture. Rather than opposing AI generation, shooting becomes more distinct, and so does AI generation, when the two are combined.

Shooting, AI generation, media programming. This series has so far worked through each of these — three families of moving-image production that stand on different principles — in turn. The next step is the stage of curriculum design: how to combine them and teach the combination to students.
