I wrote this in 1988. Some of it seems a bit dated now but I have resisted the temptation to re-write it or try to bring it up to date. What I wanted to do was to articulate an 'atomic' approach to editing theory by looking at what editors actually do when they are editing.
The Ideal Editor: Notes Towards a Theory of Editing
Part One—The Ideal System
Over the last twenty years I have often heard both film editors and videotape editors affirming that their medium is “better” than the other. I have always found such assertions rather meaningless (although I confess that my prejudices were, rather naturally, in favour of film).
Over the last four or five years I have had the privilege of talking to many people from VT about the nature of film editing. I have also tried, in my talks, to take a fresh look at the old question of whether film or tape is a better editing medium. To this end I tried to develop a more rational and objective way of comparing the two.
I believe that in doing so I have made it possible to take a fresh look at the strengths and weaknesses of both media, and to look ahead to new developments which will supersede them both. I know that in the course of this I have certainly learned some surprising new things about film editing which had not really occurred to me before. It is this approach which I wish to present in this paper.
Basically, my approach is to lay down a design specification for an ideal editing system which might be applicable to any editing operation. In fact many of the concepts are drawn from the design philosophies of a number of computer editors, but I hope that they are of sufficient generality to apply widely.
In this paper I am not only going to adopt an ‘ideal’ approach, but also one which some might characterize as ‘idealistic’. This is because the perspective will be editor-centred, rather than manager-centred.
Hence, criteria such as cost of installation, and length of useful life will not be of primary importance. Reliability, though, would be more important because it impinges directly on the editor’s ability to work at his or her own pace. (In recognition of the fact that editing—film editing at least—has always been a craft pursued by both men and women with equal distinction, I shall use masculine and feminine pronouns indiscriminately.) The justification for this approach seems to me to lie in the concept of ‘de-skilling’, used sometimes to characterize the introduction of some forms of new technology. When de-skilling occurs, the operator is obliged to fit in with the machinery, rather than vice-versa.
By focusing on the needs of the editor, I hope to be able to expose some of the compromises and inadequacies inherent in current film and videotape editing systems, and to point the way towards greater flexibility for the future. In this way new technology may prove to be enabling, rather than disabling.
With the preceding proviso in mind, I present the following as a list of essential requirements for an editing system.
Having started with high promises of objectivity, it may seem ironical that I immediately back-pedal, and make the first element on my ‘wish list’ a concept as subjective as ‘interactivity’. Yet it is crucial to any editing operation.
A good editing system will allow the editor to experiment; it gives her the opportunity to make tentative judgements, try them out, and then amend them, with the minimum of effort. In computer jargon, a good user interface is essential if the editor is to work efficiently.
The ideal editing system is like an organic extension to the editor herself. Anything which interferes with this harmonious relationship should be considered as a design flaw. Much of the specification which follows is intended as an attempt to define some of the features necessary to achieve true interactivity.
Incidentally the term ‘editor’ should properly be applied to the person who uses an editing system, and not to the system itself (notwithstanding the usage in amateur movie making which once led to the delightfully evocative headline, “Motorize Your Editor for £4.50”!).
Define a block
Any editing system must be fitted to the medium in which it operates. In particular it must be able to operate upon the ‘atomic’ elements of that medium. For instance, a word processor which did not allow access to individual letters would not be very useful (especially if your typing is as bad as mine).
This may seem a trivial point, but in fact, it might have some quite far-reaching implications as we shall see later.
My third point is “instantaneous access to any block”, but it is important to note that I am asking for psychological instantaneity rather than some specific data fetch time. The important thing is that the system works at the editor’s pace, never requiring him to wait when he wants to work.
For instance, there is no requirement that any word processor should be able to input more than 300 words per minute from a keyboard (and considerably fewer when I type), but it is still possible to find programs which cannot keep up with even a moderate typist. These are very difficult to use because they interrupt the ‘flow’ which is so important to a writer.
In fact, though, what is required is more than simple access. All editing operations should be carried out as fast as possible. Again, the reason for this is simply the editor’s need to keep the flow going. Editing is a process which takes place in the head as well as on the editing system; ideally, the physical editing should take place at the same pace as the mental operations.
After the essential preliminaries, we come to the core of the editing system, the block operations. These operations are fundamental to the editing process; without them (or their simulated equivalents), editing will be a crude and stilted affair.
Editing involves the selection and arrangement of elements from a field containing many elements. Sometimes the editing task involves all of these elements, but at other times the ‘work in progress’ only concerns a limited number of elements, selected from the wider field. As editing progresses, elements may be transferred from the wider field to the work in progress or vice versa.
Source and target
All of the block operations are concerned with the movement of blocks: new blocks may be imported into the work in progress, existing blocks may be removed from the work, or blocks may be combined with one another. When considering a block operation it is necessary to distinguish between the source block, the target block and the resulting block. The source is the block before the block operation has taken place, the target is the block which will be operated on. The result will depend on the nature of the block operation chosen.
As far as the source is concerned, there are two possibilities: it may be moved, or copied. A ‘copy’ operation leaves the original material intact in its original position and makes the source block a duplicate.
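The difference between the two possibilities can be sketched in a few lines of Python. This is only an illustration, assuming that the wider field can be modelled as a simple list of named frames; the function names `copy_block` and `move_block` are hypothetical and do not come from any real editing system.

```python
def copy_block(field, start, end):
    """Return a duplicate of field[start:end]; the field itself is untouched."""
    return list(field[start:end])


def move_block(field, start, end):
    """Take field[start:end] out of the field and return it."""
    block = field[start:end]
    del field[start:end]
    return block


rushes = ["A1", "A2", "A3", "A4", "A5"]

copied = copy_block(rushes, 1, 3)
rushes_after_copy = list(rushes)   # still all five frames

moved = move_block(rushes, 1, 3)
rushes_after_move = list(rushes)   # the two frames have gone from the field
```

After the copy the original material is still available for another attempt; after the move it exists only in the block that was taken away.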
There are six fundamental editing operations: the five ways in which the source block may be combined with the work (append, replace, insert, delete, and merge), together with the transformation of the source itself.
1) Append. This is the simplest of all the editing operations. A new block is simply added to the ‘end’ of the work in progress. This should take place without affecting the rest of the work. (In general, the only time an append operation causes difficulty is when there is not enough room for the source block.)
2) Replace. The simple append operation is not sufficient to provide any kind of sophisticated editing. It is often desirable to place the source block at a target location ‘within’ the work in progress. In the replace operation, the source obliterates the target material which was previously present at that location. It should not affect any other material in the system.
3) Insert. Although replace can be useful, it is more common for the editor to wish to insert the source block without destroying the material already present. The insert operation allows him to do this. The editing system should permit a block of any size to be inserted at any point in the work being edited. The block may come from within the work in progress, or be imported from outside. The insertion should not have any adverse effects on the material already present. Furthermore, the editing system should take care of any consequences of the insertion.
To continue with the example of the word processor, I should be able to insert a block of several standard paragraphs in a form letter or a single letter in the middle of the word “editr”. And if I do insert the letter “o”, it must not overwrite the “r” and this implies that all the characters after the “t” must be moved down towards the end of the document in order to make way for the “o”. I do not expect to have to do this myself, but rather, I rely on the editing system to do it for me—and to do it ‘instantaneously’.
4) Delete. The ability to remove blocks from the work in progress is just as important as being able to add them. The editing system should permit a block of any size to be deleted from any part of the work being edited. This should not have any adverse effects on any other material in the work. Furthermore, the editing system should take care of any consequences of the deletion.
If I delete a letter from “edditor”, I do not expect any of the other letters to be affected. I also expect the gap left by the removal of the extra “d” to be closed up by the editing system. I should not have to do it myself.
5) Merge. The fifth of the fundamental operations is the merge. In this, the source block is combined with the target block already existing in the work. Depending on the nature of the blocks and the editing medium, there may be several different ways in which this combination can take place.
6) Transform. Finally, the editing system may permit a transformation of the target block (which is also the source block). An example might be when a word is changed from one font to another—bold to italic, say.
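Five of the six operations listed above can be sketched as small Python functions. This is a minimal illustration rather than a description of any real editing system: it assumes the work in progress is a string (or a list) indexed from zero, and all the function names are hypothetical. Merge is left out because its meaning depends on the medium.

```python
def append(work, source):
    """Add the source block to the end of the work."""
    return work + source


def replace(work, source, at):
    """Overwrite the material at `at` with a source of the same length."""
    return work[:at] + source + work[at + len(source):]


def insert(work, source, at):
    """Place the source at `at`, moving later material along to make room."""
    return work[:at] + source + work[at:]


def delete(work, start, end):
    """Remove work[start:end] and close up the gap."""
    return work[:start] + work[end:]


def transform(work, start, end, fn):
    """Apply `fn` to the block work[start:end], leaving the rest alone."""
    return work[:start] + fn(work[start:end]) + work[end:]


appended = append("edit", "or")
replaced = replace("editer", "o", 4)
inserted = insert("editr", "o", 4)
deleted = delete("edditor", 2, 3)
transformed = transform("editor", 0, 4, str.upper)
```

The insert and delete calls reproduce the “editr” and “edditor” examples: the system, not the editor, shifts the remaining letters along or closes up the gap.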
Not all editing systems permit all of these operations. This may be because they fall short of the ideal, and would be improved by moving closer to it, but it may also be because they are not all necessary. For instance, it is hard to think of a use for a ‘merge’ operation in a word processor. Would merging “film” with “edit” produce “efdiiltm”—and if so, what for?
On the other hand, a graphics editor which allows me to create and alter pictures on my computer will find the merge operation very important. Indeed, it offers me several different ways of merging two graphic images so that I can produce a wide range of effects.
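The contrast between the two kinds of merge can be made concrete with a short Python sketch. It rests on two assumptions: that a word merge means alternating letters (target first), and that a graphic image can be modelled as rows of grey values between 0 and 255. The names `interleave` and `blend` are hypothetical, and a weighted average is only one of many possible graphic merges.

```python
from itertools import chain, zip_longest


def interleave(target, source):
    """Merge two words by alternating their letters, target letter first."""
    pairs = zip_longest(target, source, fillvalue="")
    return "".join(chain.from_iterable(pairs))


def blend(target, source, mix=0.5):
    """Merge two equal-sized grey-scale images by a weighted average."""
    return [
        [round((1 - mix) * t + mix * s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(target, source)
    ]


word = interleave("edit", "film")

black = [[0, 0], [0, 0]]
light_grey = [[200, 200], [200, 200]]
half_way = blend(black, light_grey, 0.5)
```

The word merge yields “efdiiltm”, which is of no obvious use, whereas varying `mix` between 0 and 1 in the image merge gives every stage of a dissolve between the two pictures.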
Many editing systems allow the manipulation of different kinds of material. An example of such a system is the fashionable desktop publishing program. Most DTP programs allow the editor the chance to edit both words and images. I use the clumsy term multi-medium independence to describe the property of being able to edit one kind of material without having to make any changes to any other kind. So, for instance, I should be able to edit the text in a document without this altering the graphics in any way, and vice versa.
The editor has to interact with the editing system through what is known in computer jargon as the user interface. A great deal of work has been done on the design of interfaces for computer programs, but I do not intend to cover any of that ground here. For the present all I want to do is to stress the supreme importance of this category, and to suggest that it is vital that we do not give in to complacency or tunnel vision.
For instance, a word processor which gives the user the option of using a mouse as well as a keyboard may offer a better user interface than one which uses a keyboard alone. But this should not blind us to the fact that a voice-actuated word processor may be significantly more flexible than any so far existing—or that the ideal interface may be one which allows the option of direct control by the mind.
Retrieval of intermediate stages
Editing is a process which does not always progress in an orderly or linear fashion. It is often desirable to backtrack to a previous stage. One editing system I use (it allows me to edit type faces) has a ten-stage ‘undo’ facility which enables me to cycle backwards and forwards through the last ten editing operations I have performed.
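A facility of this kind can be sketched in Python as a bounded history of states. This is an assumption about how such a feature might work, not a description of the type face editor mentioned above; the class name `History` and its methods are hypothetical.

```python
class History:
    """Keep the most recent states of a work so that they can be revisited."""

    def __init__(self, initial, depth=10):
        self.depth = depth
        self.states = [initial]
        self.position = 0

    def record(self, state):
        """Store the result of a new editing operation."""
        # A new edit discards any states that had been undone.
        self.states = self.states[: self.position + 1]
        self.states.append(state)
        # Keep the current state plus at most `depth` earlier ones.
        self.states = self.states[-(self.depth + 1):]
        self.position = len(self.states) - 1

    def undo(self):
        """Step back one stage, stopping at the oldest state kept."""
        if self.position > 0:
            self.position -= 1
        return self.states[self.position]

    def redo(self):
        """Step forward one stage, stopping at the newest state."""
        if self.position < len(self.states) - 1:
            self.position += 1
        return self.states[self.position]


history = History("editr")
history.record("editor")
history.record("editors")
back_one = history.undo()
back_two = history.undo()
forward = history.redo()
```

Because whole states are stored, any of the last ten stages can be recalled at once instead of being remade from scratch.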
All computer-based editors permit the work to be ‘saved’ in a permanent form. As a consequence of this, it is possible to save as many intermediate forms of the work in progress as is desired. However, this is not normally a specific feature of the editing system, although many will automatically save the previous version as well as the current one.
No quality loss, reliability
These are both obvious criteria. When manipulating the medium, we do not want it altered in a random or non-controllable way. There should be no ‘noise’ in the system.
Reliability is also important because it can directly affect the editing process. By its very nature, a breakdown will normally only manifest itself while editing is in progress, and so it is bound to interrupt the creative flow. In extreme cases, such as a computer system crash, it may cause the loss of already completed work (indeed, the previous version of this very paragraph was lost when my word processor crashed before I had a chance to save my work to disk!).
Ancillary support systems
I have detailed a number of possibly useful ancillary support systems which a good editing system might provide. Strictly speaking, these are not concerned with editing, but they can make the job much easier.
Search and/or replace. If there is a large amount of material, it can be a major task to keep track of it. Some kind of indexing becomes necessary. If the editing system can provide automatic search facilities, this can speed the editing process, thus avoiding one more frustration which might impede the editor’s creative flow.
Macros. Some editing jobs require the editor to perform the same sequence of operations several times. In such a case, a ‘macro’ facility which will allow this repetition to be automatically performed may be helpful.
Libraries. There are times when the work in progress needs material to come from outside the general work field. It is helpful if the editing system can maintain the libraries and arrange for the introduction of new material. If it does not do so, then the strength of the links between library and editing system becomes important.
Help. The more complex the editing system, the more likely the user is to forget how to use some of its more obscure functions. In such a situation the system should provide assistance—preferably in what is known as ‘context-sensitive on-line help’. In other words, when I press the ‘Help’ button (assuming such a thing exists) the system should immediately respond with helpful information about the task I am currently undertaking.
For complex assistance, it is common for off-line help to be available (normally in the form of an instruction manual), but with the development of CD-ROM systems it will soon be possible to have on-line complex help integrated with the editing system.
Error checking. It has been suggested that even editors make mistakes occasionally! As artificial intelligence techniques develop, it may be possible for the editing system to operate some sort of error checking. Perhaps the best known example of this at present is the spelling checker in a word processor.
This concludes my specification for an ideal editing system. Doubtless it can be criticised and further refined, but I believe that it does provide a valuable starting point for discussion. In part two I will use it to compare and contrast film and videotape as editing systems.
Part Two—Comparison Between Film & Videotape
Before I move to the first major practical point of this essay, which is to provide a rational comparison between film and videotape as editing media, I must stress that there is still a great deal of subjectivity involved. In particular, I feel that my assessments of both media are likely to be too favourable, and that this is especially true of my assessment of film.
I have observed, in both film and VT editors, an overwhelming tendency to be satisfied with the facilities provided by the system they use. In particular, I have discovered this in myself and it is only by performing the exercise described in these pages that I have begun to be able to make objective judgements about film editing.
There was a time when I would have characterized film as being quite close to an ideal editing medium; I now recognise that as being very far from the truth.
Nevertheless, I present my own biased assessments of the differences between film and VT in order to provoke thought, rather than controversy. It is the principle which counts, rather than the exact details of the scoring. Table two repeats table one, but compares both film and VT editing systems with the ideal editing system. My detailed comments follow (in the case of both film and VT I have assumed ‘off-line’ operation).
I have suggested that film has a greater degree of interactivity than VT. I acknowledged earlier the subjectivity of any judgements about interactivity, and am not prepared to defend my scoring with any degree of seriousness here. My subsequent comments may help to explain why I come to this conclusion. What is more important is the realisation that both film and VT fall a long way short of the ideal.
Define a block
The smallest block is the individual frame, the largest is the whole of the work. Film obviously has no problems in this area. VT editing in component form is the same. PAL editing may give rise to difficulties in accessing an exact frame because of the PAL sequence, but this can be got around if necessary. It therefore seemed appropriate to give a ‘yes’ to both media for this category.
It may be remembered that I said that what really counts is ‘psychological instantaneity’—I want it when I want it. I have therefore been quite happy to rely on subjective feelings in assessing the two media. I do not know whether it is ‘really’ quicker to get access to material on film, but I do know that it feels quicker to me!
In part this reflects current editing practices, rather than simply being inherent in the media. It is common practice, before starting to edit a film, to break the material down into discrete elements. In the case of a drama, this will normally be individual selected takes. In the case of a documentary, the elements may be longer, but will consist of a few shots—all related to one sequence. This means that once the editing process has started access to the required material is likely to be swift.
Furthermore, the nature of the equipment used, especially the PicSync, further encourages the rapid viewing and reviewing of a number of shots. In theory this could also be possible with offline editing: the rushes could be copied off onto separate cassettes in a process entirely analogous to film’s ‘breaking down’. In practice, I do not believe that this is done at present.
There is a further, purely psychological, factor at work. When I look for a shot while cutting film, I am active and able to remain engaged in the editing process even though it is temporarily halted. When I look for a shot when cutting tape, all I can do is wait. My personal experience is that this is likely to frustrate me, and alienate me from the editing process. Whether this is either inevitable or universal I am unable to say.
Source & target
A major difference between film and VT becomes apparent here. In film the source is moved to the target; in tape it is copied. If all other things are equal (which they are not in this case), the copy operation is more useful than the move, since it permits a wider range of options.
Both film and tape can append material quite satisfactorily. Of course, if they could not, they would be little use for the task in hand. However, append is of limited use in the later stages of editing. It is essentially concerned with the assembly of the work material from the wider field (rushes, library shots, rostrum, etc). Once an assembly has been made, and new versions are required (even if differing by only one frame from the previous version), append is not usually an appropriate operation.
Replace is essentially a copy operation, since the source is copied ‘on top of’ existing material. It is of limited use since it requires the source and target to be of the same size. Typically, it is most useful for an operation such as dropping cutaways into a master shot. Equally typically, this tends to reveal its weaknesses, since cutaways rarely match a master exactly, and really require the subsequent alteration of cutting points within the master shot so that, for instance, hand movements match.
It is not possible to perform a replace operation in film, although it is quite possible to simulate it by using an insert and then a delete (or vice versa).
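The simulation can be sketched in Python. The sketch assumes a roll of film modelled as a list of frames; the function names are hypothetical, and the delete is performed before the insert, although the reverse order would serve equally well.

```python
def insert(work, source, at):
    """Splice the source into the work at `at`."""
    return work[:at] + source + work[at:]


def delete(work, start, end):
    """Cut out work[start:end] and join the two ends."""
    return work[:start] + work[end:]


def replace_by_insert_and_delete(work, source, at):
    """Simulate a replace: remove the old frames, then insert the new ones."""
    shortened = delete(work, at, at + len(source))
    return insert(shortened, source, at)


master = ["M1", "M2", "M3", "M4", "M5"]
cutaway = ["C1", "C2"]
result = replace_by_insert_and_delete(master, cutaway, 2)
```

The cutaway takes the place of exactly as many master frames as it contains, so the overall length of the sequence is unchanged.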
Insert & delete
Despite the fact that the term ‘insert editing’ is sometimes used when editing VT, neither the insert nor delete operations can be performed on tape. This is a major disadvantage because once the assembly stage has been reached, it is these two operations which are by far the most useful when editing.
In fact, for this reason alone it is possible to confidently assert the ‘superiority’ of film as an editing medium. Of course, VT can simulate insert and delete by using copying (strictly, replacing blank tape with the previous version of the program) and appending. But anyone who has tried to insert a few frames thirty minutes into a forty-minute program will appreciate that this is a far from satisfactory solution.
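The cost of the simulation can be illustrated with a rough Python sketch. It assumes a tape that can only be recorded onto its end, a frame rate of 25 frames per second, and the simplest possible strategy of rebuilding the whole program on blank tape; the function name `tape_insert` is hypothetical.

```python
FRAMES_PER_SECOND = 25  # assumed frame rate for this illustration


def tape_insert(previous_version, new_frames, at):
    """Simulate an insert on tape using copy and append only.

    Returns the new version and the number of frames that had to be recorded.
    """
    new_tape = []
    recorded = 0
    # Copy the head of the old version, then the new frames, then the tail.
    for block in (previous_version[:at], new_frames, previous_version[at:]):
        new_tape.extend(block)  # append: record the block onto the end
        recorded += len(block)
    return new_tape, recorded


program = list(range(40 * 60 * FRAMES_PER_SECOND))  # a forty-minute program
extra = ["x", "y", "z"]                             # three frames to insert
insert_point = 30 * 60 * FRAMES_PER_SECOND          # thirty minutes in

new_version, frames_recorded = tape_insert(program, extra, insert_point)
```

Under these assumptions, adding three frames means recording 60,003 of them, whereas on film the same change is a single splice. The previous version does at least survive intact, which is the one advantage of working by copy.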
In my experience, the major strategy adopted by tape editors to handle this problem is simply to try to avoid it—in other words, a prime example of ‘de-skilling’; letting the system dictate the technique.
Just as insert and delete belong to film alone, so merge belongs solely to tape. It is true that the dissolve and fade can be indicated on film, and executed at answer print stage. It is also true that some marvellous merge effects can be achieved by optical houses. But by the time these effects are observed it is either too late or too expensive to change them. The primary criterion of interactivity is completely absent.
In my experience, the major strategy adopted by film editors to handle this problem is simply to try to avoid it—in other words, yet another prime example of ‘de-skilling’; letting the system dictate the technique.
(Incidentally, the merge operation is frequently performed by the VT editor in respect of the sound in a program. Film editors never mix sound, leaving that part of the process to the dubbing mixer.)
Until I started to analyse the editing process in depth, I too was almost completely blind to the disadvantages inherent in film’s inability to interactively merge, or to be able to copy a source (making interactive use of effects such as slow motion impossible). I simply accepted this as ‘one of those things’ and tried to find an alternative way of solving the problem or expressing the program content. I will have more to say on this topic later.
Norman McLaren of the National Film Board of Canada used to draw directly onto celluloid and was thus able to transform it. In normal practice the film editor has no way to transform the image or sound interactively. The VT editor can do so, though the extent to which this can be done will depend on the range of ancillary equipment available, such as digital effects boxes or sound mixers and equalisers.
There are two media involved in the production of most programmes: sound and picture. Sepmag film gives the chance to edit each independently. Videotape is a ‘combined’ system, and does not offer quite such a degree of flexibility. Most of the time the problems can be solved quite satisfactorily, but there are occasions which seem to defeat VT.
In particular, I am reminded of a documentary I edited on High Band U-Matic some years ago. I tried to treat this in as film-like a way as I could. All went reasonably well until I came to lay the commentary. In my opinion, the ideal way to add commentary is to get the programme to an almost finished state, and then record the commentary to the picture so that the commentator can get the feel of the existing picture and sound rhythms. Then the commentary track is laid to the picture.
This is a complex process because the aim is to get the best balance of three different rhythms: that of the picture (which is itself complex, being the result of the interaction between the internal rhythms of the individual shots and the rhythm of the pace of the cuts between those shots), that of the program sound (which may also be complex), and the rhythms of the commentary words and the delivery of the commentator (one of the principal aims of the commentary recording sessions is to achieve some measure of consonance between words and delivery).
In order to achieve the optimum balance, the editor needs to have maximum flexibility. In practice, this means that she must be able to alter the spacing between the words (lengthening or tightening pauses between phrases, normally) so that the commentary fits the program, and also make subtle picture changes to accommodate the commentary. It is a kind of ‘both ends towards the middle’ strategy, and is very difficult to perform even on a six-plate Steenbeck.
When I tried it using two U-Matic players and one recorder, I nearly wept with frustration. The major problem was the lack of interactivity, due mainly to the fact that I couldn’t insert or delete, but also to the impossibility of laying down the commentary and then deciding to move it back two frames—without making any alterations to the basic programme.
Some videotape editors find film’s ‘hands on’ approach to editing very difficult, perhaps often because of film’s basic intransigence, and the time it takes to master it. Some film editors find videotape’s ‘control panel’ approach to editing alienating, often because they are unused to anything which smacks of computers or ‘machines’. Because of this, and also because I wish to return to the subject in part three, I have refrained from scoring this category.
Nevertheless it is my impression that the film interface is more direct and allows a faster and more natural interaction with the material. This is probably because I believe that input is more natural when achieved in an analogue, rather than digital, mode. Film editing comes closer to this.
Because film uses the ‘move’ operation exclusively, any change to a sequence inevitably destroys the existing version. There is therefore no easy way to go back to previous states. There are two basic editing situations in which such an ability would be useful.
Firstly, although we are always looking for the ‘perfect’ cut, which can be recognised as soon as it is seen, in practice we often have to settle for the one which is least offensive. In such a case, the editor tries an option, decides that it is not perfect, trims it by a frame, say, and then decides that the previous state was better. It would improve both efficiency and interactivity if he could immediately recall his previous attempt instead of having to remake it from scratch. The ‘preview’ facility offered by VT edit controllers goes some way towards offering this kind of facility. It is certainly superior to film in this respect.
Secondly, it is my experience that in the cutting of a complex documentary the structure of the programme often deviates significantly from the first assembly during the middle stages of the schedule, but has a tendency to move back towards first thoughts as the programme nears completion. In such a case it is often necessary to try to reconstruct an earlier version of a sequence.
Until I edited a documentary on videotape, I was not really aware of how much I missed such facilities. The ability to keep previous versions of a sequence, as permitted by VT, can be a very useful facility, and one which I now miss in the film cutting room.
No quality loss
Since both film and VT can be operated as off-line systems, quality loss is not really a problem for either of them. There may be legitimate debate about the relative merits of the recorded images, but this is a function of the media themselves, not the editing systems entailed by them.
My figures are a guess, based on my apprehension that both systems are fairly reliable and that the relative simplicity of film tends to make it more so than tape. Although no breakdown is acceptable, the interruptions are not so frequent as to seriously interfere with the editing process.
I consider that neither film nor VT has any support systems of the kind I envisaged earlier. The closest might be the Film & Videotape Library, but since there is no way to use this interactively at present, this also does not qualify.
When I started to develop the ideas presented here, my aim was to demonstrate objectively that film is superior to videotape as an editing medium. I believe that I have done so, largely because of the overwhelming importance of the insert and delete operations in enabling fast, efficient, and interactive alterations to an existing structure.
However, in the course of the exercise, I also became aware of a large number of deficiencies inherent in film (essentially because of its inability to perform interactive copy operations), which had previously been hidden from me. I began to see that it is becoming increasingly important to look beyond the present limitations of both systems and try to chart a path into the future. In the final part of this paper, I will be doing just that.
But first I would like to give a specific example, from my own experience, which might flesh out some of the bones given above and point the way forward for further discussions of the way editing technique interacts with, and is conditioned by, the characteristics of the editing system.
Assembling a sequence
Most of my work as an editor has been concerned with documentary programmes. The assembly of such a programme tends to proceed in discrete stages, as individual sequences are put together. A ‘sequence’ is a group of at least three shots, and therefore requires at least two ‘cuts’ (film terminology) or ‘edits’ (VT terminology) to fashion it. Most sequences have more shots, typically between five and twenty.
When I assemble a sequence, I interact with the editing system in a number of ways. Firstly, I view the material and make tentative editing decisions as I go. If there is any degree of complexity in the sequence, these initial decisions will soon turn out to be untenable. I will then need to review specific shots to try to work out alternative methods of ordering the shots and the action within them.
In order to do this I use the PicSync, which permits me to lace and unlace individual shots very quickly, so that I can check my mental cutting of the sequence. When I am happy that I have a tenable basic structure for the sequence, I will then make the physical joins and wind it onto the assembly roll. In practice, I usually do not bother to look at what I have done, since I join on the right-hand side of the PicSync (that is, after the film has gone through the picture gate).
The first time I will normally view a sequence is when the director comes in for the assembly viewing. The point is that at this stage I am concerned with structure, not with the minutiae of pacing and continuity. If some of the cuts are not smooth or timely, I do not worry—it is the overall ‘feel’ which counts.
There is another aspect of my working practice which is relevant here. Some sequences are fast and exciting, others are slow and melancholy; the pace is dictated both by the intrinsic nature of the material and also by its structural position in the film as a whole. I have observed that I actually work faster than normal when I am cutting a fast sequence, and that I slow down when cutting a slow one. I don’t know whether this is an essential part of my technique or a subconscious affectation, but it feels important to me.
My difficulty with videotape
When I first tried to edit a documentary on videotape, I had quite a shock. My normal method of working was not viable. It was simply not possible to look at shots in the fast interactive way that I do on the PicSync (as I mentioned earlier, if the rushes had been broken down onto separate cassettes this might have been possible, but I did not try this). I had less freedom to test alternative strategies, and tended to accept an option simply because it seemed to work, rather than because it genuinely felt right.
Furthermore, there seemed no way to alter my pace in sympathy with the pace of the sequence I was cutting. The inexorable sequence of ‘roll back, roll forward, start the edit, finish the edit’ seemed to impose a rhythm on me which was more powerful than any I could impose on the material.
Everything I did seemed to be just a little pedestrian and one-paced. The cuts were often technically perfect; it was their cumulative effect which seemed inadequate. I am still not really sure why this should be the case, but I do have a few tentative thoughts.
I believe that good programme editing is founded on rhythmically coherent sequences, not on technically proficient edits. The editor responds to the intrinsic rhythms of the material at her disposal and to the overall editorial requirements of the programme to produce as harmonious an ensemble as possible. It is a fact that no edit exists in isolation: change one, and both subsequent and preceding edits are affected and may also need to be altered. There is a sort of ripple effect; the closer edits are to a change, the more likely they are to be affected.
This means that each edit should be made with reference to the others which exist (if only potentially) on either side of it. The film editing system is fairly good at allowing this interaction to be made. Videotape, on the other hand, does not seem to allow it.
In fact it seems to me that film editing is sequence-based, while videotape editing is edit-based. This conclusion is prompted largely by experience, and seems to be connected with the ‘roll back, roll forward, start the edit, finish the edit’ sequence mentioned earlier. The length of time this takes seems to preclude proper consideration of the ‘ripple’ effects of an edit.
I accept that this may be in part due to my own relative lack of experience on videotape, but I am still fairly convinced that there is a real and inherent effect present, which represents a fundamental difference between the ‘syntactical units’ manipulated in film and VT editing.
A comparison of music sequences on film and tape might bear this out. It is my impression (which seems to be shared by a number of people from both film and VT) that music sequences on videotape tend to show a greater concern with synchronising the cuts to the rhythm of the music than similar sequences on film. On the other hand, the film sequences seem more concerned with establishing a relationship between the internal rhythm of the pictures and the music. I recognise that this is a subjective and unsystematic judgement but it is one which could be tested and made more precise if desired.
To the film editor, film offers freedom when compared with videotape. To most of us, the transition to tape offers some advantages in technique but involves an overall loss of flexibility. It is therefore a potential example of ‘de-skilling’ and would represent a serious threat to our craft skills, if it were likely to become a permanent system for the foreseeable future. Fortunately this is not the case. On the horizon is the promise of an editing system which far surpasses both film and VT, and it is to a consideration of this that I turn in the final part of this essay.
Part Three—The Future
Consider table three. It is the same as table two, except that I have added a third column, labelled ‘Digital Video’ and given this a set of scores which approach my ideal very closely. The exact design of such a system, especially at the hardware level, is something I am not qualified to comment on, but its broad outlines should be possible to discern.
In the ideal digital video system, all picture and sound information will be digitized and held in random access memory. Not only that, but information about the information will also be held. This second qualification is vital, because of the multi-medium nature of programme making. If I see a picture, I need to be able to find the sync sound which originally belonged to it, and vice versa.
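As a minimal sketch of what holding 'information about the information' might involve, the fragment below keeps a small record for each picture and sound take and uses a shared identifier to find the matching sync partner. The names Take, sync_id and find_sync_partner are hypothetical, invented purely for illustration; a real system would no doubt hold far more than this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Take:
    name: str         # human-readable label for the take
    medium: str       # "picture" or "sound"
    sync_id: str      # hypothetical identifier shared by picture and sound shot together
    start_frame: int  # frame number at which the take begins in the store


def find_sync_partner(take: Take, library: List[Take]) -> Optional[Take]:
    """Return the take in the other medium that shares this take's sync_id, if any."""
    for other in library:
        if other.sync_id == take.sync_id and other.medium != take.medium:
            return other
    return None  # e.g. a wild track or a mute shot has no partner
```

Given a picture take, the same lookup finds its sound, and vice versa, which is the two-way link the paragraph above asks for.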
Note, incidentally, that the reference information is not necessarily required for conforming or neg cutting, as is the case with current film and VT systems (VITCode, or key numbers). This is because digital video will in principle be an ‘on-line’ system: since there is no quality loss in operating digitally on digital information, any version will be of transmission quality. In practice, it may be necessary to work with a substandard picture quality in order to save memory, and then conform later.
I have suggested a score of 9/10 for the interactivity of the digital system. It could be argued that I am not being critical enough. For instance, one could envisage that a truly ideal programme editing system would respond instantly and accurately to a verbal (or mental?) message from the editor such as, “Show me all the sunset shots.” Clearly, this is a long way off, requiring significant advances in speech recognition and artificial intelligence techniques.
What is clear, though, is that the digital system has the potential to be significantly more interactive than any existing system. In terms of speed of access and flexibility of operation it promises to provide the opportunity to significantly extend the boundaries of the editor’s art.
Define a block
Not only have I put a ‘Y’ for the ability to define a block of any size, but I have also changed the previous affirmatives into negatives. Earlier I argued that the smallest block of interest to the film or VT editor is the frame. With the advent of digital techniques, it becomes clear that the smallest possible unit is actually the pixel (an abbreviated form of ‘picture element’: one of the actual dots on the screen which go to make up the complete picture).
However, this in itself does not make it relevant to the editor. In part one I pointed out that it is not always necessary or desirable to move to the smallest possible unit, giving as an example the fact that it is not necessary to edit the individual letter shapes within a word processor. Is the same principle at work here?
I think not, as a simple example will show. A common occurrence in documentary editing is the jump cut in an interview. The editor wishes to join two pieces of interview, but the head size and position are almost identical on either side of the cut. This always looks ugly, and the usual solution is to try to find a cutaway—which often looks only marginally less ugly. But since we can do nothing about it, we accept the situation.
However, suppose that it were possible to make a smooth transition from one shot to the other, animating the speaker’s head from position one to position two, while keeping the lip movements accurately in sync. With the possibility of editing individual pixels, this becomes theoretically feasible.
Another common documentary technique is to edit a contribution in order to make it more succinct (this is often referred to as ‘cleaning up’ what the speaker has to say). At times this goes so far as to construct new sentences from existing phrases and individual words. Indeed, it is occasionally possible and necessary to make new words—“hand” from the ‘h’ of ‘hat’ and the word ‘and’, for instance. Digital editing not only offers the possibility of making such manipulations easier, it could also transform them altogether. For instance, it is not beyond the bounds of possibility to imagine that we could analyse the contributor’s voice patterns and then synthesise the required new speech in such a way that it sounded perfectly natural. If you coupled this with a corresponding synthesis of appropriate lip movements, you could make anybody say anything you wished them to—and because the techniques employed are digital it would be almost impossible to detect the deception.
The potential for unscrupulous agencies to use such techniques for the purposes of forgery or discrediting opponents should be obvious. But their use in general broadcasting also requires examination. Even though they are some way off, it seems imperative to me that there is a general awareness of what may become possible, and that the ethical implications are fully discussed.
I also feel that this discussion will inevitably have an impact on present practice—for instance, there seems to be a feeling among those I have spoken to, that our present technique of hiding a jump cut with a cutaway is legitimate, but that digitally ‘smoothing’ the transition is not legitimate. However, it is not easy to see the basis for this judgement and it may be that we are too lax at present in what we consider to be morally permissible. There isn’t space for further discussion of this topic here, but it is one which requires urgent and comprehensive attention.
In the ideal digital system all the relevant material will be stored in RAM which can be directly addressed by the central processor (or array of processors). Memory constraints may well mean that a preliminary selection has to be made of the material pertaining to the sequence to be edited. However, when that has been done it should be possible to access any frame virtually instantaneously (subject to satisfactory processor speed, of course). The time constraints will now be in the mode of selection; in other words, the limiting factor will become the efficiency of the user interface, a point to be discussed later.
Source & target
Strictly speaking, all operations in a digital system are of the ‘copy’ type. Nevertheless, provided the system can shift bits at a fast enough rate, it will be possible to simulate a ‘move’ operation in a way that is completely transparent to the user (as is the case with most existing computer word processing editing systems). Complete flexibility should be assured.
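The idea of simulating a 'move' with copies can be sketched as follows. This is only an illustration, assuming that a sequence is held as a simple list of frames; the function name move_block and its parameters are hypothetical. Nothing is moved in place: the block and the remainder are both copied into a new sequence, yet the result is what the editor would expect from a move.

```python
def move_block(frames, start, length, target):
    """Return a new sequence in which frames[start:start+length] has been 'moved'.

    `target` is the index, counted in the sequence with the block taken out,
    at which the block is to reappear. The original list is left untouched.
    """
    block = frames[start:start + length]             # copy of the block itself
    rest = frames[:start] + frames[start + length:]  # copy of everything else
    return rest[:target] + block + rest[target:]     # reassemble around the target
```

For example, with six frames labelled A to F, moving the two-frame block starting at B to position 3 of the remainder gives A D E B C F.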
A digital system will be capable of all six fundamental block operations. (Again, this is not strictly true, but sheer processor and blitter speed should ensure a simulation of insert and delete which are completely satisfactory to the editor.) Furthermore, these operations will all be available for use on pixels or groups of pixels within a frame. Indeed, such devices are already beginning to come on the market.
I remarked earlier that videotape is capable of interactive merge operations which are impossible on film. However, I do not believe that this facility has really been used as it might. The advent of digital systems may change this. An example may suffice to indicate the sort of technique I am thinking of.
Some years ago I was working on a film for children’s television about herring fishing with a purse seine net. This technique involves finding a shoal of herring by sonar (the fishing has to be at night because that is when the herring come up towards the surface), and then steaming round the shoal, letting out the net. When the net surrounds the fish, draw strings are pulled in and the net closes around the shoal, securing them in the ‘purse’. The net is then drawn in, and the fish are sucked out of the water by a huge vacuum cleaner.
We had film of the events, but it was hardly self-explanatory. I therefore proposed that we use a split screen, with an animation of the process in the bottom half, and the corresponding live action in the top half. It worked quite well in the end, but it would have been much easier to do if I could have edited the sequence interactively instead of having to guess and then hope that it would be OK when the optical came back from the laboratory.
I believe that we have much to learn in the use of multi-screen and other merge processes. Indeed, at times I think that the crudity of current editing techniques could be compared to that of painting before the discovery of perspective: sometimes brilliant, but usually limited and lacking in vitality. The advent of digital editing may herald a similar breakthrough.
Once again, the digital system will be able to offer greater flexibility than any existing method. The sound track(s) will be completely independent of the picture, and slipping sound by any number of frames will be easy (and require far less processor power than shifting picture).
There is a huge literature on the human-computer interface and I do not propose to cover the subject in any detail. But it does seem worth making a few preliminary comments. I have already remarked that an analogue style of interface seems preferable to me, and one way of implementing this would seem to be to make use of a pointing device such as a mouse.
Research has shown that a mouse is superior to a keyboard (or joystick, trackball or cursor key pad) as a device for selecting items. Since selection of material lies at the heart of editing, the mouse would seem to be ideal for this purpose. A possible alternative would be a touch screen, although this would assume that the screen was close to the editor—something which is usually true in film editing, but less often so in videotape editing.
One of the advantages of the digital system is that it should be possible to implement a flexible interface which offers a variety of editing modes. For instance, during assembly, all the shots relevant to a sequence could be displayed simultaneously on screen, rather as is done with Slide File at present.
Pointing and clicking could start a picture running. Each picture could be made to zoom out to fill the screen if required. Edit points could also be selected with the mouse. Sometimes it would be advantageous to be able to examine a number of successive frames at once, for instance when trying to find the best action point to make a cut. This can sometimes be done on film simply by picking it up and looking at it. On a digital system it would be even easier and certainly more effective.
The mouse could also be useful for other operations. For instance, on the display of the Sony BVE 900, a ‘split edit’ is shown graphically. It would be useful to be able to ‘drag’ the sound tracks ahead or behind the picture rather than having to input numbers on the keyboard (though that should still be an option for those who prefer it). This is common practice with Macintosh or GEM screen interfaces, and the principles could be extensively applied to digital editing systems.
Since some form of non-volatile storage (such as tape streamer or re-writeable optical disk) will be necessary, it will be possible to store intermediate versions for as long as is required. Multi-level ‘Undo’ should also be possible, even though memory considerations would probably preclude the kind of buffering usually associated with Undo on computer editing systems. Instead, an edit decision list could be stored and the state of the system configured according to the position chosen in that list (in other words, a kind of instant auto-conforming, requiring only the changes to be updated).
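One way such an Undo might work is sketched below, on the assumption that each editing decision can be recorded as a small entry in a list and that the assembly can be rebuilt by replaying that list up to a chosen position. The names Edit and EditHistory, and the fields they carry, are hypothetical. For simplicity the sketch replays from the beginning each time, whereas the scheme described above would update only the changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    shot: str       # hypothetical shot name
    in_frame: int   # first frame used from the shot
    out_frame: int  # frame after the last one used
    position: int   # index in the assembly at which the shot is inserted


class EditHistory:
    """Keeps every decision; Undo simply moves a cursor back along the list."""

    def __init__(self):
        self.decisions = []
        self.cursor = 0  # number of decisions currently in force

    def apply(self, edit):
        del self.decisions[self.cursor:]  # a new decision discards undone ones
        self.decisions.append(edit)
        self.cursor += 1

    def undo(self, steps=1):
        self.cursor = max(0, self.cursor - steps)

    def redo(self, steps=1):
        self.cursor = min(len(self.decisions), self.cursor + steps)

    def assembly(self):
        """Rebuild the assembly by replaying the decisions up to the cursor."""
        shots = []
        for e in self.decisions[:self.cursor]:
            shots.insert(e.position, (e.shot, e.in_frame, e.out_frame))
        return shots
```

Because only the list of decisions is stored, and not copies of the pictures themselves, any number of levels of Undo costs very little memory.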
Quality & reliability
In the ideal system, quality would be superb, and apart from the occasional passing cosmic ray there should be no degradation of sound or picture throughout the editing process. In practice, though, it may be necessary to work with a substandard picture in order to save memory. In this case, a conforming process would need to be undertaken when the editing was finished.
Reliability of solid state devices is generally good, but I am simply not competent to speculate about the reliability of a system as complex as the one envisaged. It may be that my estimate is wildly optimistic, but I hope not.
Search and replace
With a digital system, many of my hypothetical ancillary support systems become a possibility. For instance, a kind of search and replace could be implemented by linking the shot film to a shot list. Searching through the shot list for, say, “sunset” should enable all the sunset shots to be displayed.
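A minimal sketch of such a search is given below, assuming that the shot list is held as a table of shot identifiers against typed descriptions. The function name find_shots and the identifiers in the example are hypothetical. It does nothing cleverer than a case-insensitive text match, so it is only as good as the descriptions entered.

```python
def find_shots(shot_list, term):
    """Return the identifiers of all shots whose description mentions `term`.

    `shot_list` maps a shot identifier to its written description.
    The match ignores upper and lower case.
    """
    term = term.lower()
    return [shot_id
            for shot_id, description in shot_list.items()
            if term in description.lower()]
```

For example, with a shot list of three entries, two of which mention a sunset, find_shots(shot_list, "sunset") returns the identifiers of those two shots in the order in which they were logged.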
This presupposes that shot lists will be generated and stored electronically, something which still does not happen in the BBC. I am sure that it would save much time and money if all PAs and cutting rooms were issued with portable computers (such as the Cambridge Computer Z88) and shot lists were entered on location and then transferred to the cutting room computer.
The amount of time wasted looking for ‘sunset shots’ is enormous. Looking further ahead, it may be possible to develop algorithms which will enable the editor to ask for dialogue or even pictures ‘similar’ to a specified target.
Edit list management systems already offer the possibility of automating some sequences of operations on videotape. These are the equivalent of macros, and their use could well become more widespread with the advent of digital systems. However, unless we are looking far into the future when artificial intelligence might make semi-automated editing possible (or at least assembly of sequences) there doesn’t seem to be a great deal of use for them.
With the advent of digital systems, it is possible to envisage on-line access to libraries so that stock shots would be almost instantaneously available and would suffer no quality loss in the copying. However, this would require significantly better information transfer rates than are presently available over the phone system—sending a few gigabytes of picture information at 1200 baud would not be very funny!
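To put a rough figure on this, the arithmetic can be sketched as below. It assumes that a 1200 baud line delivers 1200 bits per second, that a gigabyte is taken as 10^9 bytes of 8 bits each, and that there is no protocol overhead at all; the function name transfer_days is invented for the illustration.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(n_bytes, bits_per_second):
    """Days needed to send n_bytes over a line of the given speed, ignoring overhead."""
    seconds = n_bytes * 8 / bits_per_second
    return seconds / SECONDS_PER_DAY
```

On these assumptions transfer_days(1e9, 1200) comes to a little over 77, so each gigabyte would take about eleven weeks to arrive.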
On-line help will certainly be possible with digital systems. It will also probably be very necessary, since their complexity will not be any less than current systems—although a more natural user interface will ameliorate some of the problems. VT systems are already so complex that editors can be faced with a piece of ancillary equipment whose operation is unfamiliar to them. A decent help system is actually long overdue.
The scope for error checking does not seem to be great, although that may simply be due to my failure of imagination. Nevertheless, there are some areas where it might be useful. Synchronization is one which comes to mind—a warning could be given if sound and picture were out of sync by a certain amount. Perhaps a figure of between one and five frames would be appropriate—more than that would be very noticeable and almost certainly deliberate (as in a piece of voice over).
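Such a check might be sketched as follows. The function name sync_warning and its parameters are hypothetical, and the limits of one and five frames are simply the tentative figures suggested above, not tested values.

```python
def sync_warning(picture_frame, sound_frame, min_slip=1, max_slip=5):
    """Return True if sound and picture are slightly, and so probably accidentally, out of sync.

    `picture_frame` and `sound_frame` are the original sync frame numbers of the
    picture and the sound lying at the same point in the cut. A slip of zero is
    in sync; a slip above `max_slip` is assumed to be deliberate.
    """
    slip = abs(picture_frame - sound_frame)
    return min_slip <= slip <= max_slip
```

A two-frame slip in either direction would raise the warning, while a thirty-frame slip, as in a piece of voice over, would pass without comment.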
In this paper I have tried to provide an objective basis for the discussion of film and videotape editing systems. I have shown that there are indeed differences between them, and that:
Finally, my perspective has been unashamedly editorial. I accept that engineers and managers may disagree with both my ideals and my assessments. My concern is that the editor’s voice should be heard when decisions about editing are being made—otherwise the craft skills that have been developed over the last eighty years may be diminished rather than enhanced.
Richard Seel 29th December 1988