Tips and Guidance for Students Writing Papers and Reports
Gernot Heiser


Introduction

This document originally grew out of frustration of having to fight my way through poorly written thesis and paper drafts, and was targeted at my own students. It developed (mostly grew) over time as I found new issues to address.

In the meantime, externals seem to use it too. I have consequently started to make it less specific to my own students, in the hope that more people benefit from the significant effort (accumulated over time) that went into this. Please send me feedback if you find any errors or have suggestions for improvements.

I get to read a fair amount of student prose, and the quality is highly variable; some of it is anything but a pleasure to read. In fact, some of it is very poorly written. Sometimes it's an early draft (where somehow the author thinks things will magically improve later), sometimes it is even a final (submitted) version of an undergraduate thesis.

I also do a lot of reviewing, mostly as a member of program committees of conferences. And sadly, some of those papers submitted for publication are not much better than drafts my students give me to read.

Such poor writing is annoying and counter-productive:

I've written up some general hints on technical writing, followed by more specific guidelines aimed at students writing their thesis. (This is mostly aimed at undergraduate thesis students, although I've assessed PhD theses which got a lot of the same things wrong, so PhD students may find this useful too.) I separately discuss writing conference papers.

Finally there is a section listing style and grammar issues I encounter most frequently in student prose, and some guidelines on how to do better. However, there is much more to get wrong, and I recommend getting a good style book. I generally follow:

Pam Peters. The Cambridge Australian English Style Guide. Cambridge University Press, 1995.

This describes the “official” rules in place in Australia (as far as there is such a thing). It often isn't specific enough for the purposes of technical prose. The following is an excellent book, geared towards folks like us. The author knows computing jargon, and knows her nerds:

Lyn Dupré. Bugs in writing: A Guide to Debugging Your Prose. Addison-Wesley, 1995

This book's main drawback is that it uses American rules, which sometimes conflict with Australian/British rules. She follows the official American rules to the letter, including where no-one else does, and occasionally produces bizarre results. The book is nevertheless very useful.

It is acceptable to use American rules (for papers), but only if you use them consistently. That means using all American spelling as well as grammar and style rules. Don't mix!

An interesting case is program vs. programme. The former is American while the latter used to be the British (and Australian) spelling. However, the Macquarie Dictionary since at least the beginning of the century considers program the correct spelling for all cases (not only computer programs), while the OED treats both equally. Conclusion: use program.

Another good (although highly incomplete) reference is the Notes for Authors that editor Peter Salus wrote for the (long-defunct) Usenix Computing Systems journal. You can tell from the examples that this was written in the '80s, but the advice (except as it relates to the troff typesetting program) is still current.

The below hints are generally consistent with British/Australian rules, but sometimes narrower, particularly if this makes them consistent with American rules as well.

General Advice on Technical Writing

One of the most crucial aspects of writing a good technical paper is what I call maintaining user state. Like a good operating system, the writer should ensure that the (mental) state of the user (i.e. reader) is kept coherent. A good writer is fully aware of the relevant state in the mind of the reader at any point of the paper/report.

What do I mean by this? Basically it means that the paper systematically builds up the reader's understanding and knowledge of the work, starting from a reasonable initial state. This means you need to put yourself into the reader's shoes (or, rather, brain) and ensure that they can follow at every point. One of the characteristics of good writers is that they do this well. Here is what this means:

There are probably more rules of this sort; I'll add them as I think of them (and feel free to suggest some to me). In summary, the more you worry about maintaining user state, the more readable people will find your work.

About (Honours) Thesis Writing

This section focuses on honours (undergraduate) theses, as they tend to be the ones needing advice. However, there are late-stage PhD students who are also well advised to read this (I've seen examples both when reading drafts from internal students as well as when assessing theses from other unis).

General advice

First rule is think before you write. Have an outline, know what you want to write about in each part, and how to approach it. If you start off with a brain dump, the final thesis will probably look like a brain dump. Not a good position from which to get a high mark... The section on structure tries to help you with this.

Also, be careful how you write. Ensure that the thesis is highly readable. This implies following the general style and grammar rules; violating those distracts the reader and makes the text harder to follow. These rules have developed for a reason. Also, check below for the proper use of “we”.

You may think I'm petty for insisting on proper prose. The reason I do it is because a report that ignores these rules is hard to read and annoys the reader. Making it hard to read wastes my time, and I don't feel I've got time to waste.

Many students have the attitude of “I'll write it down quickly and worry about the details later.” That's fine, as long as you worry about the “details” before you present a draft for feedback. Experience is they don't, and sloppy remains sloppy. I am yet to experience a case where I got to read a sloppily/carelessly written draft which ended up being a well-written thesis. These cases may exist, I just haven't seen them. Avoid starting off in the wrong direction! The section on typical mistakes tries to help you with this. Read it before you start, and read it again before you hand out a draft for feedback!

Also, take feedback on your draft seriously. This means not just blindly fixing marked-up issues, but thinking about the comments. Particularly if the same mistake gets highlighted repeatedly, think about why you make this mistake, and how you can avoid making it again. How else do you want to learn good writing? Writing your thesis is your job, not mine; I only provide feedback so you can learn!

Obviously, having a close look at a number of good thesis reports is a good idea. However, there are at least two problems with this: which of the reports posted on the DiSy thesis page (only accessible from within the cse.unsw.edu.au and nicta.com.au domain) are good, and, given that none is perfect, what should one look out for? There are obviously no marks posted, and even if you know that a particular thesis achieved a high mark, you can never be sure whether that was because of, or despite, the writeup.

There are no firm rules on how to write a thesis, and there is certainly a lot of advice available. I'll try to concentrate on a few main points (which tend to apply to a lot of technical writing, not only to theses but also to conference papers).

Structure and coherence

Make sure your thesis is well structured, that each major section does what it is supposed to do, and that the whole thing hangs together. The basic structure is often as follows (but other structures are possible).

In particular, don't think you need to have exactly as many major sections or chapters as the list below implies; sometimes it makes sense to merge things, sometimes it makes sense to move things (e.g. the literature review is in many papers deferred until after the results), sometimes it makes sense to split a logical part into several individual sections. Also, PhD theses often integrate multiple broad issues which are covered in separate chapters, each with its own background and related work. Use common sense!

Title
Use a descriptive title for your work. Don't use a title that promises more than you'll deliver, don't use a title that implies something different from what you've done. (The focus of a thesis often shifts in the course of a year, don't be afraid to adjust the title, in consultation with your supervisor.)

Abstract
A short (1–3 paragraph) summary of the work. Should state the problem, major assumptions, basic idea of solution, results. Avoid non-standard terms and acronyms. The abstract must be able to be read completely on its own, detached from any other work (e.g. in collections of paper abstracts). Don't use references in an abstract.

Introduction

Introduce the problem (gently!) Try to give the reader an appreciation of the difficulty, and an idea of how you will go about it. It's like the overture of an opera: it plays on all the relevant themes.

Make sure you clearly state the vision/aims of your work, what problem you are trying to solve, and why it is important. In particular, make sure you highlight the challenges you need to overcome. Having made it through the intro, the reader should have a clear idea of what to expect in the remainder of the work. This applies to the problem, the author's hypothesis, and (at a very high level) the approach taken. Leave out any of these at your peril!

While the introduction is the part that is read first (ignoring title and abstract), and it is generally a good idea to write it first, in order to define the roadmap for the thesis, it is important that you revise (and more often than not, re-write) the intro after the document is essentially finished, and all the results are in and understood. This is essential to ensure consistency.

Remember, the intro is the first thing that is being read, and will have a major influence on how the reader approaches your work. If you bore them now, you've most likely lost them already. On the other hand, if you make outrageous claims, pretend to solve the world's problems, etc., you're likely fighting an uphill battle later on.

Make sure you pick up any threads spun in the introduction later on, to ensure that the reader thinks they get what they have been promised. Don't create an expectation that you'll deliver more than you actually do. That is called bait and switch, and tends to leave the reader highly dissatisfied. Remember, the reader may be your marker (of a thesis) or referee (of a paper), and you don't want to piss them off.

It's also important that you provide enough context and indication of the limitations/assumptions of your work to avoid surprising (and disappointing) your reader.

Exposition of problem

The basic problem should have been stated in the intro, here is the place to go into detail.

Make it clear you know what you are talking about (and this includes being complete, don't jump right into things, give the reader a chance to follow). Give a thorough and complete discussion of the problem, enough so an educated reader whose speciality is outside yours can appreciate that you're trying to attack an interesting problem, and also appreciates what's interesting about it. (Remember to maintain user state!)

Btw, don't call this section “exposition of the problem”, or you'll be immediately exposed as someone who can only follow recipes. Same applies to the next bit.

Literature review (often called “related work”)

This is really important. If you cannot demonstrate that you know, and understand, what others have done, you only demonstrate that you're clueless. For an undergraduate thesis this, together with a thorough understanding of the problem, should be the result of the first session's work. It is an unfortunate fact that many students do very little work during the first session of their thesis. It usually shows here (and is usually reflected in their mark). Don't think you can fool your thesis supervisor/assessor. And don't even dream about fooling the referee of a paper. If you haven't done your homework here, it's probably not worth going any further.

In this part you demonstrate that you are aware of what's going on in the field, and how it relates to your particular problem. In an undergraduate thesis (unlike a conference paper) it may be ok to repeat work that has already been done elsewhere (usually in somewhat different circumstances). Be open, and explain why what you're doing is still worthwhile. In the more normal case that you're doing something that hasn't already been done, convince the reader that this is actually the case. One of the less convincing arguments goes along the lines of “a Google search on ‘frying giblets on StrongARM-driven toasters’ didn't turn up anything”. You might as well pack up here. The way to convince the reader that your work hasn't been done before is to explain what has been done, what's different about what has been done, and, if you're good, why it hasn't been done already. There is always related work, and the more vague you are about it, the more obvious it is that you haven't done your homework. (And, no, looking at all the Google hits isn't enough.)

Sometimes some relevant background work is quite old; the discipline goes in cycles and it isn't all that infrequent that people rediscover things that were done 30 years ago (virtual machines are an example). In such a case please note that the language has changed a fair bit in the meantime. You're not doing your reader a favour by reporting an old paper's findings in that paper's language (and in the informed reader's mind you'll raise the suspicion that you don't understand what's going on). Talk about the work of the paper in contemporary systems language. This makes it easier to compare to other work, including yours.

Design of your solution

Having explained the problem, and what others have done in similar situations, now explain your approach. Again, give a general overview of your design first, and then go into detail (i.e. use a top-down approach). Make sure that the document (particularly a thesis) is self-contained: It should be possible for a reader familiar with the general area (that means operating systems, not methods for implementing free-block lists) to understand your design. (Remember to maintain user state!)

Discuss design tradeoffs before you present the design you have settled on, don't use the backward approach of “I'm doing it this way. I could have done it that way, but...” This smells of having been added as an afterthought. Show that you have thought things through, and convincingly show how and why you have arrived at the best solution.

Note that this may be an inversion of the approach you have taken in reality: You might have tried something, run into problems and then changed the design. Remember: your thesis isn't an activity report, it is the presentation of research. Which detours you took to arrive at the destination is primarily irrelevant (and in many cases just an admission of not having thought things through before you started). Focus on the outcome, not the journey!

It's not necessarily wrong to point out what traps you fell into, but present that in the context of a discussion of design tradeoffs. Sometimes the correct design may be impossible to determine a priori, making some early experiments essential. But that doesn't mean it should be presented as a history lesson. Discuss the alternatives, say what you did to investigate the implications, and then present your design decision.

Importantly, be forthright about the limitations and assumptions of your design. Also, make sure you justify any shortcuts/limitations convincingly.

Implementation

In many (not all) cases there is a clear difference between the general approach (design) and its implementation in your particular circumstances. The design may be more general than what you can do given time and resources. Or you have developed a general design, and are now implementing a prototype on particular hardware. Or the design is relatively high-level but leaves open a lot of implementation questions. Avoid mixing up discussions of design and implementation! Design is first, implementation later.

Give all required details. It should be possible to understand all this without referring to the source code. (I generally refer to the source code to check whether the code is consistent with the report, I shouldn't have to do this in order to understand the report.)

This will, in general, include extracts of actual source code (or pseudo-code), basic algorithms, function prototypes etc. Don't list pages of C code, an electronic copy of the source should accompany the submission and should be available to the marker, so there's no point in killing extra trees to put it into the report.

Make sure you describe your implementation in enough detail. (Maintain user state!) Someone who has nothing else but your thesis report to go by should be able to repeat your work, and arrive at essentially the same implementation. Reproducibility is an important component of scientific work. Also, clearly state the limitations of your implementation, and justify them.

Experiments

A thesis almost always has an experimental part, typically some benchmarking. This is usually its weakest part. Many students debug their code less than a week prior to the submission deadline (typical indication of having started too late) which makes it hopeless to do any real benchmarking. Benchmarking takes time, for running the experiments, but also for thinking them up in the first place, and for analysing the results (and, inevitably, deciding you'll have to do more benchmarks to clear things up).

Probably the majority of theses I mark are really deficient in this part, typically for lack of attention (often resulting from a late start). Think about what makes sense to measure, what you want to learn from your measurements. Think about what is really the relevant contribution of your thesis, and how you can prove that you have achieved your goals. Think about what you can measure in order to get a good insight into the performance of various aspects of your design, how you can distinguish between systematic and accidental effects, how you can convince yourself that your results are right. Most of this should have been done during Part A of your thesis: together with your project plan, you should have decided what your success criteria are, and how to establish that you have met them.

If you get surprising results, don't just say "surprise, surprise, performance isn't as good as hoped". Find out why. Surprises without explanation indicate either that you are clueless about what's going on, or that you have made a mistake (most likely both). Unconvincing results tend to imply unconvincing marks. (Of course, this could be avoided if the results were available more than a couple of days prior to the thesis deadline.)

It is amazing how few students have even the faintest clue of the most basic statistics and their use. Measurements always have statistical (sampling) errors. Owing to the deterministic nature of computers these are sometimes very small in our area, particularly in the case of micro-benchmarks, where disturbing factors can be minimised. However, the reader should be given an indication of how statistically significant the results are. This is done by providing at least a standard deviation in addition to averages. Whenever the results of several runs are averaged, a standard deviation can (and must) be supplied. After all, you average to reduce statistical errors.
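As a minimal sketch of what reporting a standard deviation alongside an average can look like, the following Python snippet summarises repeated runs of a micro-benchmark. The function name summarise_runs and the latency values are made up purely for illustration; only the standard-library statistics module is assumed.

```python
import statistics


def summarise_runs(samples):
    """Return (mean, sample standard deviation) of repeated measurements."""
    if len(samples) < 2:
        # A single run gives no information about the spread.
        raise ValueError("need at least two runs to estimate a standard deviation")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample (n-1) standard deviation
    return mean, stdev


# Hypothetical latencies (microseconds) from five runs of the same benchmark.
latencies_us = [10.0, 12.0, 11.0, 13.0, 14.0]
mean, stdev = summarise_runs(latencies_us)
print(f"latency: {mean:.2f} us (std dev {stdev:.2f} us, n={len(latencies_us)})")
# prints: latency: 12.00 us (std dev 1.58 us, n=5)
```

Stating the number of runs next to the average and standard deviation lets the reader judge how much weight the numbers can bear.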

The reproducibility argument applies here just as much as for the implementation. Give enough detail on what you measure, and how you measure it, so that someone who has your implementation (but not your test code) or has re-done your implementation independently, should be able to repeat your measurements and arrive at essentially the same results. I read many theses which contain results which seem outright wrong. In most cases not enough detail is provided to allow me to pinpoint the likely source of the error. In many cases the cause is systematic errors resulting from an incorrect measurement technique. If it seems wrong, and the text doesn't convince me that it isn't wrong, I will assume that it is wrong.

Discussion

Discuss and explain your results. Show how they support your thesis (or, if they don't, come up with a damned good reason real quick). It is important to separate objective facts clearly from their discussion (which is bound to contain subjective opinion). If the reader doesn't understand your results, you probably don't either. And this will be reflected in the assessment.

Conclusions

Don't leave it at the discussion: discuss what you/we can learn from the results. Draw some real conclusions. Separate discussion/interpretation of the results clearly from the conclusions you draw from them. (So-called “conclusion creep” tends to upset reviewers. It means surrendering your scientific objectivity.)

Identify all shortcomings/limitations of your work, and discuss how they could be fixed (“future work”).

I repeat: don't stick slavishly to this structure. Also, remember that the thesis must be:

Also, a thesis isn't called “thesis” by accident: It is supposed to present a thesis you are making about some system, and your justification and confirmation of that thesis. This means that a thesis is not an experience report. You may have taken a few detours and explored a few blind alleys. Some of that might be valuable to document, but only for what general truths can be learned from it, e.g. what the pros and cons of particular design decisions are.

So, explain the facts (and what's behind them) but don't bother the reader with the details of how you got to the end. I repeat, focus on the outcomes, not on the journey!

Kevin Elphinstone has written an excellent guide on how to write a thesis, which also contains further references. My physics colleague Joe Wolfe has written a PhD thesis guide from a somewhat different angle. And there is a wealth of info at the Online PhD Guide.

About Paper Writing

Paper vs thesis

Thesis and paper writing are related: both are technical presentations of work done. The main differences between a paper and a thesis are:

General rules

As far as general writing style is concerned, I find it useful to think in terms of the Two Cs below. Most papers I find poorly written (as opposed to technically deficient) fall down on one or both of them, or on structure (which is separately addressed below). Here are the Two Cs:

Be clear.

This should go without saying, but too many don't get it. Clarity applies at several levels, from individual words, to sentences, to paragraphs, to sections, to the whole paper. Make sure that at every level, what you are writing conveys a clear and unambiguous message.

For example, make sure that each sentence is unambiguous in its meaning. Make sure your formulation is precise.
Make sure every paragraph has a clear and unique purpose. Make sure a paragraph is homogeneous in its message (talks about one thing only) and paragraph breaks correspond to changes of points. When there is a point that requires a lot of text, so you want to break it into several paragraphs, make sure you break at a point that doesn't tear your argument apart. (Try to keep paragraphs to 2–4 sentences.)

Make sure each section (at any level) has a clear and coherent purpose and message. (Yet avoid sections getting too long, preferably not more than a single column, if at all possible not longer than a page, else break it up or sub-structure it.)

Most importantly, make sure the whole paper has a clear and consistent message. Make sure you understand what this message is! You should be able to state this message in one short sentence. Then stick to it, in the abstract, in the intro, in the body, in the conclusion.

It happens not infrequently that I read a paper, don't find any major faults (but typically not much excitement either) and at the end ask myself what I've learnt, or what the paper was trying to tell me. If I'm a reviewer, I'll argue for rejection in this case.

Summary: clarity at all levels!

Be concise.

In a thesis you can afford to be waffly (although it won't endear you to the reader), in a paper you can't. Reviewers expect a fair amount of meat in the 12 or so pages you've got, and if you waffle, you won't fit enough meat in. Furthermore, bloated prose tends to be harder to read. If something can be said in a sentence or in a paragraph, say it in a sentence.

Obviously, conciseness must not come at the expense of clarity; if shortening means loss of clarity, don't. Also, keep in mind what I said above about maintaining user state. However, experience shows that in most cases the more concise formulation is also the clearer one.

A common experience is that I edit a student's paper draft and in the process shorten it by 20–25%. In most cases the student will agree that clarity has improved at the same time (I call this “gainful compression”). A good method is “Jay's Rule of Thumb”, named after Jay Nievergelt, one of my early supervisors (I was pretty careless in my early postgrad days and lost a few of them ;-). It means you cover up parts of the text with your thumb. If it doesn't change the meaning, you know where you should cut.

Summary: Be brief but complete.

Besides that, it is important that the paper is readable. This means that reading it should be an enjoyable experience, not hard work! If it's hard to read, you're already in an up-hill battle, irrespective of the content. Try to write in a lucid, engaging style. Explain things top-down, not bottom-up.

While good writing is a bit of an art, paper writing is also a lot of engineering. And, like in engineering, if the product looks like it's sloppy and thrown together quickly, the reader automatically becomes suspicious that it may not hold up. And experience shows that sloppy presentation frequently goes with content that isn't very solid. Keep that in mind while writing!

Make reasonable assumptions on the background and experience of the reader. Not every reviewer will be a complete expert on your area, but you need to convince them anyway! On the other hand, don't think you can hide something from reviewers, at least one is likely to be a real expert and perfectly able to spot any holes!

Also, make sure your paper is properly spell-checked and proof-read. If you're not a native English speaker, get someone who is to help with proof-reading. Sloppiness annoys reviewers!

It's a good idea to get a non-author to read it sceptically (as a reviewer would); this can help spot holes before it is too late.

Excellent advice is also contained in a set of slides by Simon Peyton Jones from Microsoft Research Cambridge.

Expectations on systems papers

How papers are expected to be written depends a bit on the discipline. What characterises systems is that ideas are considered cheap, and may get you a workshop paper, but nothing more. For a real paper you need to build a system and evaluate it.

The classical How (and How Not) to Write a Good Systems Paper, written by the PC co-chairs of SOSP'83, spells it out quite clearly and, at about 30 years of age, is still highly relevant. In the end, it comes down to making a convincing case of solving a real problem.

Note that the emphasised words are all relevant: “solving”, “problem” and “real”. You need to identify a problem, offer a solution, and show that you actually solve the problem. Solutions looking for problems find very little interest in this community! And coming up with a cool idea may get you some brownie points, but it isn't good enough.

Structure

Technical papers tend to have a certain structure. In the systems community, there is a fairly clear consensus on what the structure should be. It is outlined below, with recommendations on what to put into the various sections.

Abstract

The abstract serves multiple purposes. In any case, it is supposed to give the potential reader a good idea of what to expect in the paper. If I come across a paper with a title that sounds like it might be relevant, I'll next look at the abstract, and then decide whether I'll look further.

For a paper which is submitted to a conference for review, the abstract serves a single purpose: to attract the right reviewers (and many others don't seem to understand this). Most conference PCs have a bidding process, where PC members express preferences for papers they want to review. These preferences are generally made on the basis of looking at the title and then the abstract.

The critical importance here is to attract the right reviewers, those who really understand (and appreciate) what you're up to. It's bad news to end up with reviewers who are only marginally qualified/interested. Your paper has the best chances if it is reviewed by the PC members with the most relevant expertise. (That is, if the paper is good. If it's not, then you shouldn't have submitted it in the first place!)

This means that you need to word the abstract carefully so it correctly sets the expectations on the content of the paper. Make sure that it allows the reader to judge whether your paper is more theoretical or practical, whether it is in their area of interest and expertise. For example, a paper dealing with OS security issues might at one end deal with formal security models, or on the other end with design and implementation techniques which decrease the attack surface. Both are important and relevant, but the best reviewers (from a given PC) are likely to be different people!

BTW, it's always a good idea to look at who actually is on the PC, and think about whom you prefer as reviewer of your paper. Then write the abstract to get them interested!

Once your paper is accepted, the abstract has a different purpose: it should contain the right keywords to direct searches, and give people an idea of whether it contains something of interest for them. That's similar, but different from the purpose of the submission version (you're now trying to appeal to a wider audience) so you might want to re-write.

Introduction

This is the most important part of the paper, certainly as far as writing is concerned. Here you need to convince the reader that you have identified a real problem (which includes motivating why it is relevant!) and outline your approach to solving it. And make it clear that you have actually solved it!

It is often a good idea to write the intro in two steps. Write it before you write anything else, this will define where the paper is heading. Then, at the very end, go over it very carefully to ensure that it is still consistent with the rest of the paper. In particular, ensure that the intro doesn't promise more than what the paper holds. Reviewers get very angry with bait-and-switch papers (see below)!

Make sure that the intro is concise, yet interesting, highly readable, and complete. It should not be longer than about a page; if it is, you're probably putting too much detail in. Leave that for later. Use the intro to whet the reader's appetite: make them want to know more (but don't leave them guessing whether the paper is in their domain of interest).

Also, try to put a diagram/figure on the first page. This immediately makes the paper look more appealing. Obviously, the figure must be related to what you are presenting and help understanding, else it's a useless filler. Good examples are diagrams (eg from measurements) which highlight the problem you're trying to solve, or an indication of your results. And it should be referenced on the first page, else it belongs to where it is referenced.

Avoid buzzwords, over-the-top statements and outrageous claims. This applies to the whole paper, but the introduction section is particularly prone to over-selling. This makes the reader suspicious and frequently annoyed. Examples of buzzwords are “new”, “novel”, “innovative”, etc. These are useless fillers and should be avoided! If your work weren't novel, you wouldn't be trying to publish it, right?

One particular kind of over-selling is called “bait and switch”: promising (or appearing to promise) great things in the intro but then delivering only a subset or something completely different. This happens a lot, and is an excellent way to bias reviewers against you!

Then cut to the chase! After getting an idea what you're trying to sell me, I want to know how it works. At least a general idea of how you solve the problem should be presented in the intro. I want to see that what you are saying makes sense. It is extremely annoying having to read through lots of cruft to find out how it's supposed to work (especially if you fail to convince the reader that your approach will work). That's typically a death sentence for your paper!

Then I want to get an idea of the outcomes of the work. That means a brief, high-level description of your results. But the outcome is more than the experimental results, it's the general contribution the paper makes. List your contributions explicitly, best done in a bullet list, with forward references to the sections in the paper where I can find them.

In summary, the intro must convey that you meet the general criteria for a systems paper: identified a real problem (motivate that it is real and interesting), come up with a solution (give a rough idea what the solution looks like) and actually solved the problem (high-level summary of results).

An utterly useless (despite its popularity) part of the intro is the paragraph starting “the rest of the paper is structured as follows” and frequently ending “and we conclude in Section 5.” What a pointless waste of real estate! There is no useful info in such a paragraph, particularly if you followed my advice of having a bulleted list of contributions with forward references.

Background

This is where you go into details about the motivation for your work, and any other background required to understand it. The length of this is very dependent on your paper, it's hard to give a general guideline here.

This section will contain a lot of references to related work. Some people will therefore have a related-work section right after the intro (I have done that sometimes too). However, in most cases that's not a good idea. Focus on providing the background needed for the rest of the paper and get to the interesting bits as quickly as possible; having to cover and discuss everything that's related only distracts from that mission. So, it's generally better to put the related-work section at the end. However, make sure you properly cite every bit of background you introduce!

What you've actually done

Of course, you won't use that as a section title, but this is what the middle part of your paper is all about. It's typically broken into two sections, eg concepts/design and implementation, but it can be more or fewer sections.

The best general advice here is to be concise, precise and easy to follow. This sounds like motherhood and apple pie (and is to a degree) but you won't believe how many submissions can't get this right. Remember, a PC member might read 20–30 papers. Keep them interested. If a paper is boring, or hard to follow, it's got an uphill battle to be accepted.

Use diagrams where possible to explain things. Good diagrams help the reviewer get through your paper quickly, without missing important content. If it can be explained with the help of a diagram, it should be.

In the PC meeting, where the fate of papers is decided, papers without a “champion” (someone who really likes a paper and wants it published) stand a poor chance of surviving. Do your best to ensure that there is at least one PC member who'll champion your paper!

Evaluation

This is where the rubber hits the road: You now have to prove that you've actually done it (solved your problem) and done so in a convincing way. This means finding the right evaluation criteria—meaningful benchmarks which demonstrate that you have something useful. It also means looking good on those benchmarks.

I intentionally said “prove”: The evaluation isn't about just going through the motions of showing some numbers, it must instill confidence in the reader that your solution really is what you claim it is. If you are not doing your best to show that your solution is up to scratch in every respect, the reviewer will suspect that you're trying to hide something. So, your evaluation needs to be pro-active in the sense that it needs to anticipate what problems the reader might suspect, and deal with them head-on.

Select your evaluation scenarios carefully and convincingly. Don't artificially construct best cases; they will be discounted if they don't present a convincing scenario.

Also, keep in mind that any improvement must satisfy a progressive as well as a conservative criterion (and your evaluation must show this). The progressive criterion requires that you demonstrate significant improvement with respect to the problem you have identified. The conservative criterion requires that you demonstrate that you have not worsened the situation in all the other circumstances people may care about. For example, if your work speeds up certain system calls, I want to be convinced that this is not at the expense of other system calls (or a strong argument why this doesn't matter).

This means that you need to think carefully about worst-case scenarios for your approach, and show that they aren't too bad. Go out of your way to be fair, and think of scenarios an adversary (who wants you to look bad) might come up with. Reviewers appreciate thoroughness here.

Above all, be honest, and be seen as such. I maintain my list of benchmarking crimes separately; make sure you stay away from those!

Related work

The most important role of the related-work section is to show that you know the field, and are familiar with all the relevant contributions made by others. Take specific care not to omit any work by likely reviewers! It's a good idea to go through the list of PC members and think about what they worked on in the past, and whether any of that work might be worth citing. Reviewers get pissed off if they think that you're re-inventing something they did before.

But stay clear of anything that might look like an attempt to bribe reviewers with citations. If it looks like you only cited my paper because I'm on the PC, and the paper isn't really relevant, I won't think highly of you. Good judgement is important!

Don't fall into the trap of trying to make prior work look bad in order to justify your own. While it is true that some bad work gets published, and occasionally some badness provides the motivation for your work, be very careful there. State, as neutrally as possible, what the prior work has achieved, and, where relevant, its limitations.

For example, saying “Doe investigates core temperature but fails to account for load fluctuations” implies that Doe stuffed up. You really only want to say that if you (a) think they stuffed up and (b) really want to make that point (“a courageous decision” as Sir Humphrey would say!). Else a better formulation might be “Doe investigates core temperature under the assumption of constant load”.

The normal assumption is that the prior work is good, and you're taking it further. It's OK to admit you're standing on the shoulders of giants! Also, remember that the author of the work you cite might be your reviewer. Little infuriates a reviewer more than the feeling you misrepresent their work, or you don't understand it. Biasing reviewers against you isn't a smart strategy.

An important aspect of this last advice is that you must have carefully read and understood the work you are citing. Don't cite a paper just because it's the standard reference and everyone cites it. Read it. Carefully. Failure to follow this rule increases the likelihood of misrepresenting the paper and annoying your reviewer (even if they aren't the author of that paper, if it's the standard work you can safely assume that they have read it and understand it!)

Conclusions and Future Work

This is where you summarise what you've achieved. This is a bit like some of the later parts of the intro, but different. Now the reader knows everything, and this is your last chance to press what you think are your main achievements. Don't over-do it, and be brief! Also, re-visit some of the limitations and what can be done to address them. However, don't promise anything if you have no intention to deliver!

Formatting

Conferences tend to be quite prescriptive about the formatting of submissions. Observe all formatting rules! In particular:

Some Things People Frequently Get Wrong

This is my list of things that people most frequently get wrong, listed in no particular order, except that the most annoying ones are at the top. I'll keep adding to this from time to time.

If you are my student, I expect that you have read this, and have checked that any draft you give me to read observes the advice below. If not, you'll get it back with not much more than “RTFSG” printed on it.

Passive Voice
Overuse of passive voice is one of the most annoying mistakes I see in undergraduate theses. (And it seems it's particularly prevalent among EE students — is someone teaching you this???) Whatever the reason, stop it!

Overuse of passive voice is very poor style. It makes for a very boring read, and it creates the impression that you are not really taking responsibility for what you've written. If 1/4 or more of your sentences use passive voice, your prose is poor.

A typical occurrence (especially in U/G theses) is the use of passive voice as a way to avoid the first person, e.g., “a suitable protocol was designed to cope with that situation”, when the student means to say that they designed the protocol. This might be a case of shyness, but it comes across as trying to avoid responsibility for one's actions. At best it leaves the reader puzzling over who actually did the work. Show through your writing that you assume ownership and responsibility for what you have done, and always make it perfectly clear what you have contributed and what came from others! And yes, a thesis (like a paper) uses the first person plural.

Just to make it perfectly clear, I will mark you down for excessive use of passive voice in your thesis. No matter how good it is otherwise!

Buzzwords
Buzzwords are annoying to the informed reader and should be avoided; they create the impression of bragging (and often outright cluelessness). In my former life I found that editors of physics journals would systematically remove words like “novel”, “new” or “innovative” in the title, abstract or intro. For good reason: if it ain't novel, why are you trying to publish it? In fact, you're creating the impression that doing something novel is unusual for you. Shy away from such words!

The same goes for terms which are popular in the trade press but rarely used in scientific work: blend in with the standard terminology of the community.

Chart abuse

The term chart abuse was coined by Martin Gardner, who for a quarter of a century wrote the brilliant “Mathematical Games” column in Scientific American. It refers to all sorts of ways of using graphs in an (intentionally or not) misleading way.

Chart abuse example

A typical example is shown in the figure on the right. Whatever the quantity on the abscissa is, you're likely to have the impression that varying that variable has a dramatic impact on whatever the ordinate quantity is, after all, it goes from almost full to almost empty, right? Of course, if you actually look at the units, you see that the dependent quantity varies by only 21%. This may or may not be significant, but it isn't anywhere near the rough order-of-magnitude change the graph seems to show on a cursory glance.

Note that not every such graph represents chart abuse; it depends a lot on the reader's expectations, the discussion of the data, and what is shown in other graphs. Just relying on showing the units may not be enough, as images can be very persuasive. But you must be aware that this graph is not showing the full story, and you need to be extra careful that this does not leave the reader with the wrong impression!

Another case of chart abuse is the gratuitous use of logarithmic scales. It's a great way to make execution time increases look insignificant. I won't fall for it!

Someone put together a nice collection of bad graphs, all good examples of how not to do it. (But I think their claim of “top ten worst graphs” is an exaggeration, I've seen worse! I guess I'll have to start my own collection, stay tuned...)

Spelling
There is no excuse for presenting a draft that hasn't gone through a spell checker. If you're too lazy to do this, then I'm too lazy to read your work. Period. And if I have to read it (because I need to mark your thesis) you'll see the result in the mark.

Apostrophes
Incredible how many people cannot use them correctly (and I suspect that it's often laziness). That's pretty much it (says Bob). But keep in mind that apostrophes are actually useful, so don't leave them off completely!

See also acronyms

Capitalisation
Don't Randomly capitalise Words. Looks Ridiculous, doesn't it?

Capitals are used for:

Capitals shouldn't be used for definitions, and even less without any obvious reason.

Note that (contrary to many “official” style guides) in scientific publishing (yes, that means you) numbered section, figure etc. references in papers are treated as proper names: In the next section we introduce the problem, and in Section 3.1 we demonstrate how to solve it. (By the way, note that the reference to a sub-section still calls it “Section”!)

Commas
This is probably what I most often get wrong myself (partially because of totally different rules in German and English). I quote the basic rules from Peters, but skip the detailed explanations. If someone wants to copy them from the book, be my guest.
[Commas] have a vital role to play in longer sentences, separating information into readable units, and guiding the reader as to the relationship between phrases and items in a series.
  1. A single comma ensures correct reading of sentences which start with a longish introductory element: Before the close of the last Ice Age, Tasmania was joined to the mainland of Australia.
    [ ... ]
  2. Pairs of commas help in the middle of a sentence to set off any string of words which is either a parenthesis or in apposition to whatever went before.
    The desert trees, casuarinas and acacias, were sprouting new green needles. (Apposition)
    The dead canyons, all nature in them reduced to desiccation, came alive with the sound of rain slithering down the crevasses. (Parenthesis)
    Note that a pair of [em-]dashes could have been used instead of commas with the parenthesis, in both formal and informal writing.
  3. Sets of commas are a means of separating:
    1. strings of predicative adjectives, as in: It looks big, bold, enticing.
    2. items in a series, as in: The billabongs at sunset drew flocks of galahs, gang-gangs, budgerigars and cockatoos of all kinds.
      A curious amount of heat has been generated over whether or not there should be a comma between the two last items in such a series (the so-called serial comma debate). [ Details omitted, summary: don't put it except where required for clarity. US rules are strict here (but are ok to ignore). ]
[ ... ]

Colons (and lists)
Colons are used to indicate that examples or specific details are to come: Note: US rules differ.

Period (full stop)
The period is used to end a sentence, as well as to identify an abbreviation. The two are actually distinguished in type-setting: a period designating an abbreviation (and nothing else) is followed by a normal inter-word space, while a period at the end of a sentence is followed by a longer inter-sentence space. Many formatters (incl. web browsers) automatically produce an inter-sentence space after each period; this is wrong if it is not the actual end of the sentence, and must be overridden by forcing an inter-word space (e.g. in HTML say “NICTA Ltd. is headquartered in Sydney”). LaTeX does it right for abbreviations ending in capitals, but otherwise the period must be followed by a backslash and a space.
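As a minimal sketch of the LaTeX side of this rule (the sentences are made up for illustration), the following shows the control space after an abbreviation, a tie as an alternative, and the opposite case where a sentence really does end in a capital letter:

```latex
\documentclass{article}
\begin{document}
% A period after a lower-case letter is treated as a sentence end,
% so force a normal inter-word space with "\ " after an abbreviation.
NICTA Ltd.\ is headquartered in Sydney.

% A tie (~) also gives an inter-word space and additionally
% prevents a line break at that point.
See e.g.~the introduction.

% A period after a capital letter is NOT treated as a sentence end.
% If the sentence really ends there, mark it with "\@" before the period.
We ported the code to a new CPU\@. It ran unchanged.
\end{document}
```

The difference is only visible with LaTeX's default spacing; documents that use \frenchspacing get the same space everywhere.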
Quotation marks
There isn't complete agreement on that in the English-speaking world. I recommend the following rules, which are compatible with the British as well as the (stricter) American rules:

Definitions/introductions of new terms
Use italics when introducing new terms. This makes it easy for the reader to find the definition again, particularly when not having the time to read the paper in one shot. Do not capitalise words when they are introduced (unless you'd normally capitalise them). Do not put them in quotation marks (see above).
Acronyms and Initialisms
Technically the difference between the two is that acronyms are pronounced as a word (NICTA) while initialisms are pronounced as individual letters (UNSW). The distinction is hardly ever made and both are generally lumped under the general term of “acronym”, as in the remainder of this document.

Properly define all acronyms on first use (except maybe those really everybody knows, such as CPU). An acronym is normally introduced by following the full term by the abbreviation, as in address mappings are cached in the translation lookaside buffer (TLB). The other order (use the acronym and put the expansion in parentheses) is occasionally acceptable if that helps the flow, but it should really be an exception. Don't introduce too many acronyms, and use standardised ones whenever possible.

Don't introduce acronyms in headings! If a term for which you want to use an acronym appears first in a heading, define the acronym on the next appearance (the first one in paragraph mode). Also, don't introduce an acronym which is then not used for a long time. In such a case it is also better to defer the introduction of the acronym.

It sometimes happens that an acronym is introduced and used more-or-less heavily in an early part of a thesis or paper, is then not used for a long time, until it is used again much later. Remember that the reader may not read the whole thesis or paper in one go, and may have forgotten what the acronym stands for. In such a case (at least if it's an acronym that isn't widely used) it's better to re-state the definition when the term starts appearing again. A very gentle way to remind the reader of the meaning of an acronym is to use it just after its expanded form in a way that makes its meaning obvious. Example: In this paper we only consider the priority-inheritance protocol. We chose PIP because.... This is obviously only acceptable if the acronym has been introduced before. Basic rule: Be nice to the reader!

Acronyms are normally all upper case; however, in our discipline mixed-case acronyms have become very common (e.g., QoS for quality of service). They should still start with a capital letter. Acronyms are almost never all lower case. The one exception is units of measurement (e.g. loc for lines of code, although journals would normally use LOC for this). If you find an all-capital acronym too imposing, consider using SMALLCAPS. However, remember to be consistent: if you decide to use a special font for something like a specific acronym, make sure you always use the same font for the thing. Also, don't go overboard with fonts; kindergarten documents are hard to read.

What's the plural of CPU? CPUs or CPU's? The answer is clear (notwithstanding many people getting it wrong): CPUs is a plural while CPU's indicates a possession or attribution. Example: Of the system's two CPUs, only one was operational. The second CPU's power supply had been disconnected.

A special case of this is acronyms ending in s, e.g. OS. I have found a (seemingly authoritative) reference which claims that in this case you need an apostrophe, but Peters has no such special rule, and I really don't see why there should be one. I strongly recommend OSes over OS's for the plural, in order to clearly distinguish it from the possessive case. Note that UNIX is traditionally pluralised as UNIXen, like oxen, but I think that's tradition rather than a grammatical rule.

In rare cases using no apostrophe for the plural might create confusion with mixed-case acronyms. In that case use an apostrophe if you really think that it improves clarity.

Units of measurement and their prefixes
Computer people are particularly notorious (others would say clueless) with respect to improper use of unit symbols. I regularly see “KB”, “kb”, “Kb” all (intending to) refer to the same thing (1024 bytes), all wrong. Specifically:

So, bit is “b”, byte is “B”, kilo is “k”, not “K”. Furthermore, the unit prefixes “k”, “M”, “G”, etc. strictly refer to powers of ten, i.e. 10^3, 10^6, 10^9. In IT contexts they frequently stand for powers of two, i.e. 2^10, 2^20, 2^30. This is of course confusing. If you think it is not, can you confidently tell me whether a Gigabit Ethernet is supposed to have a bandwidth of 10^9 b/s or 2^30 b/s?

There are in fact proper standardised prefixes for power-of-two multiples: “Ki”, “Mi”, “Gi”, etc. (Strictly speaking these are defined by the IEC rather than being part of SI.) Use them systematically!
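To make the difference concrete, here is the arithmetic for the first three prefixes, written as a LaTeX display (it assumes the amsmath package for the align* environment):

```latex
\begin{align*}
1\,\mathrm{kB}  &= 10^{3}\,\mathrm{B} = 1000\,\mathrm{B}, &
1\,\mathrm{KiB} &= 2^{10}\,\mathrm{B} = 1024\,\mathrm{B},\\
1\,\mathrm{MB}  &= 10^{6}\,\mathrm{B}, &
1\,\mathrm{MiB} &= 2^{20}\,\mathrm{B} = 1\,048\,576\,\mathrm{B},\\
1\,\mathrm{GB}  &= 10^{9}\,\mathrm{B}, &
1\,\mathrm{GiB} &= 2^{30}\,\mathrm{B} = 1\,073\,741\,824\,\mathrm{B}.
\end{align*}
```

The gap between the decimal and the binary reading grows with each step: about 2.4% at kilo, 4.9% at mega and 7.4% at giga, which is well above the accuracy of many measurements. Link speeds use the decimal meaning, so Gigabit Ethernet runs at 10^9 b/s.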

Headings
Capitalise or not? Generally speaking, only top-level or, for larger documents, second-level section headings should be capitalised. For other headings capitalise the first word (of course), but otherwise nothing you wouldn't capitalise in normal text. If you capitalise words in a heading, only do so with nouns, adjectives, pronouns, verbs and adverbs.

Excess digits (pseudo accuracy)
A common annoyance is people quoting results to three or four digits of accuracy, when the real accuracy is at best a few percent. For example, I commonly read statements like “we observed performance improvements of up to 27.81%”. This pretends that the improvement figure is accurate to about one in 10,000. Of course, it's nowhere near that! It's the difference of two other figures (the baseline and the improved system), and the uncertainty in the difference is no better than twice that of the two values. This is misleading, as it gives the appearance of something (accuracy) that isn't there.
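A rough worked example may help here. The numbers are invented purely for illustration: a baseline run time t_0 and an improved run time t_1, each measured with an uncertainty of 2 ms. The display assumes the amsmath package.

```latex
\begin{align*}
t_0 &= 100.0 \pm 2.0\ \mathrm{ms}, \qquad t_1 = 72.2 \pm 2.0\ \mathrm{ms},\\
\Delta &= t_0 - t_1 = 27.8\ \mathrm{ms},\\
\sigma_\Delta &= \sqrt{2.0^2 + 2.0^2}\ \mathrm{ms} \approx 2.8\ \mathrm{ms}
  \quad\text{(independent errors; worst case } 2.0 + 2.0 = 4.0\ \mathrm{ms}\text{)},\\
\frac{\Delta}{t_0} &\approx 0.28
  \quad\text{with an uncertainty of roughly } 0.03.
\end{align*}
```

Under these assumptions each input is good to about 2%, yet the improvement is only known to about one part in ten. The honest statement is “about 28%”; the digits after the decimal point carry no information.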

And even if the number was really that accurate? Does it matter whether your improvement is 27.81% or 27.82%? What counts is the (binary) order of magnitude: whether it's around 15% or around 30%.

At least in this case everyone can immediately see that this is pseudo accuracy, and will mentally drop the n-1 excessive digits. That's not the case in tables where you present your actual results, which makes this a worse offence. As I argue in my discussion of benchmarking crimes, results must indicate the significance (accuracy) of data, typically by stating standard deviations. But don't undo this by pretending more accuracy with excessive digits! If you show three-digit results, I expect an accuracy well below a percent. And I get annoyed if I think you're trying to fool me!

Note that standard deviations (or other kinds of errors) are a second-order effect (just as the relative improvement discussed above). As such, they are only relevant to one digit! Stating absolute standard deviations to three digits is nonsense.

A good (or bad, depending on how you look at it) example of excessive digits is the table that gets an honourable mention in the list of bad graphs discussed earlier. The discussion (rightly) also makes the point that trailing zeroes should not be suppressed where they carry information, i.e. are within the accuracy of the data. (For some reason, this only seems to affect people who write their papers in Word...)

Footnotes
First rule: use them sparingly. Many disciplines (especially humanities) use them for citations, we don't. Footnotes are used for information which is useful, but is not essential for understanding the argument, and including it in the text would disrupt the flow (similar to parentheses). If you use more than about one every few pages, there's probably something wrong with your prose. Most papers get away without a single footnote.

Second rule: Footnotes should be fair-dinkum sentences, able to be read by themselves. A footnote like 5kB is a definite no-no. Something like #define'd to 5kB. is very bad. Good is The buffer size is defined to be 5KiB. (Except that anyone using a 5KiB buffer should be shot.)

Since footnotes are sentences, it doesn't normally make sense to put them into the middle of a sentence. In particular, this means the footnote follows, rather than precedes, any punctuation. An example of correct use is We use the Fancy tool.\footnote{Fancy can be obtained from \url{http://fancy.org}.} Placing the footnote before the full stop is incorrect.

Hyphens, en-dashes and em-dashes
These are three kinds of dashes used in text:

Split infinitives
Remember to never split infinitives! :-)
According to Peters that's a bullshit rule. It's often more elegant/readable to split the infinitive, so go ahead if it avoids clumsiness, but use it sparingly to avoid upsetting old-fashioned people.

Specific terms or phrases
Like vs. such as
When you are referring to a set, the members of which have in common a given characteristic, and you wish to give an example that is a member of that set, you should use such as. When you are referring to a set that does not include your example, but that contains members that resemble your example, you should use like. Examples: Students, such as those at UNSW, sometimes are having fun. Sometimes they behave like children with a new toy. (Note that British/Australian English is more relaxed about this rule than American English.)

Spaces
Some people add spaces in the weirdest places. I don't remember all of them, but came across another annoying case so I decided to start a spacing blacklist here. Stay tuned for more entries ;-)

Before the colons in definition lists
Doesn't belong.

Some go the opposite way and omit spaces where they should appear, e.g.:

Before parentheses
Why should an opening parenthesis be glued to the preceding word? No matter whether this introduces an acronym or a non-essential remark, parentheses like air to breathe on the outside.
Before a unit of measurement
Units of measurement are spaced off the preceding number. (And a percent sign is like a unit in this case.) However, a full space generally seems too much, so I recommend using a half space (e.g. LaTeX 100\,Hz gives 100 Hz), which also prevents a line break between the number and its unit.
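A minimal LaTeX sketch of this spacing rule (the sentences and values are made up for illustration):

```latex
\documentclass{article}
\begin{document}
% \, inserts a thin space that cannot be broken across lines.
The timer ticks at 100\,Hz and the core draws 2.5\,W.

% The percent sign must be escaped in LaTeX; it is spaced like a unit.
Utilisation rose by 30\,\%.
\end{document}
```

Packages such as siunitx can automate the spacing, but plain \, needs no extra package and is enough for most papers.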

Inclusive vs Royal “We”
Scientific literature is written in the first person plural (“we form”), and theses are no exception. This is meant to include the reader in the proceedings (“we” in the sense of “you and I together”). However, used wrongly it will sound odd, especially for a single-authored work (such as a thesis), sounding like a royal we (“we, the king of this realm”) and thus pretentious.

So, use it in a way that takes the reader with you. Examples: We will now look at the dependency of power on load. Or: We run the system at its maximum performance setting and measure core temperature. We obtain the results shown in Figure 5.

In particular, this means that you should be using present and future tense, and generally not past tense / present perfect. A statement like we obtained the results shown in Figure 5 is a royal we. The reader wasn't around when you did this, so the “we” can only refer to yourself, making it an obnoxious royal “we”.

Citations and Bibliography
Should you use numeric or alphabetic citations? Some conferences or journals have clear rules on this, so you'll obviously have to follow them. Conference papers are usually squeezed for space, so numeric citation labels tend to be used as a space saver.

In all other cases, use alphabetic citation labels as these greatly enhance readability. At the least this should be the BibTeX alpha style. Particularly for a PhD thesis, which has many citations, most of them familiar to the examiner, having meaningful alphabetic labels massively reduces the need to consult the bibliography. The recommended form of citation for theses (where you have no space problem) is author name and year, as in: [Smith 2008] or [Murphy and Chaplin 1999]. Your examiners will appreciate it!
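One common way to get author-year labels in LaTeX is the natbib package; the sketch below is only one possible setup, not a prescribed one. The citation keys and the file name refs.bib are hypothetical.

```latex
\documentclass{article}
\usepackage[square]{natbib}   % author-year citations in square brackets
\begin{document}
% \citep gives a parenthetical citation, \citet a textual one.
Prior work measured this effect \citep{smith2008}.
\citet{murphy1999} take a different approach.

\bibliographystyle{plainnat}  % an author-year style shipped with natbib
\bibliography{refs}           % refs.bib is a hypothetical file name
\end{document}
```

For the shorter alphabetic labels, drop natbib, use \bibliographystyle{alpha} and cite with plain \cite. By default natbib puts a comma between author and year; its documentation describes how to change the separator.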

Abbreviating conference and journal names (as done by the group's defs-abbrev.bib) is acceptable where you have tight space constraints (typically for conferences and workshops) but not acceptable elsewhere. Least of all in a thesis, where you have plenty of space!

An interesting question is where to put the bibliography (references section). It goes at the end, of course, but if your document has an appendix, does it go before or after? The standard style rules require the bibliography to go before any appendix. I personally think that this is a pain, as when reading a thesis or book, you tend to refer to the bibliography frequently, and the appendix infrequently, and having the bibliography at the very end makes it easier to access. (External examiners of some of my students' theses have told me the same.) Therefore, in this case I recommend deviating from published rules and putting the bibliography behind any appendix.

Independent of citation style, the following rules should be followed:

Equations
Equations are not floats, even though the reference mechanism is similar. Instead, they are considered part of the prose. Hence, don't refer to an equation as you would do to a figure or table, but make it part of the sentence. Example:
The dynamic power is given as
   P = c f V^2,   (1)
where f is the core frequency, V the core voltage, and c a constant.
The equation number is only needed to reference an equation from another place in the text, and can be omitted if no such cross reference is required.
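In LaTeX source, the example could look roughly as follows. The label name eq:power and the follow-up sentence are illustrative assumptions.

```latex
\documentclass{article}
\begin{document}
% The equation is part of the sentence, so it carries the comma.
The dynamic power is given as
\begin{equation}
  P = c f V^2 , \label{eq:power}
\end{equation}
where $f$ is the core frequency, $V$ the core voltage, and $c$ a constant.

% Refer to the number only when citing the equation from elsewhere.
Equation~(\ref{eq:power}) shows that $P$ grows quadratically with $V$.
\end{document}
```

If no cross reference is needed, replacing the equation environment with \[ ... \] produces the same display without a number.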

Miscellaneous
Various tidbits:

Formalities
This should go without saying, but, apparently, doesn't: The only exception is that camera-ready conference papers often are required to be submitted without page numbers. This shouldn't stop you from using page numbers in drafts, as well as in submissions for reviewing (reduces the chance of a reviewer messing up your paper while reading).

Microsoft Word

Short summary here: don't use Word for technical papers!

Of course, it's up to you what you use to write your thesis or single-authored paper (although it betrays a strong streak of masochism to do so; just look how many people in Microsoft Research are writing papers in Word: I don't know of any!).

But for a paper where you collaborate with other authors, Word is a no-no. Besides the usual stuff (that it's easier to format a paper in LaTeX etc) there is the problem that Word doesn't integrate with the usual revision-control systems, meaning no automated merging of concurrent updates etc. And latexdiff is at least as good as Word's compare-documents function.

Some TeXniques

Here are a few useful hints from the TeXperts:
Italics
LaTeX command \it is almost always the wrong way to use italics. Use the LaTeX \emph command, which will handle nested emphasis correctly. Also, check the section on Math fonts below!
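A small sketch of what “nested emphasis” means in practice (the sentences are made up):

```latex
\documentclass{article}
\begin{document}
% \emph toggles: inside italic text it switches back to upright,
% so the inner emphasis remains visible.
\emph{This whole remark is emphasised, but \emph{this part} still stands out.}

% The old {\it ...} switch does not toggle, so nesting it has no visible effect.
% (\it is kept in the standard classes only for backward compatibility.)
{\it Nested {\it emphasis} is lost here.}
\end{document}
```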
Citations/BibTeX
A few general BibTeX (and \cite) tricks:
URLs
To represent URLs, don't just use \texttt{url} (which causes problems with the tilde character) or \verb|url| (which tends to produce vastly overfull lines). Instead use the command \url{url}, available with the url package. This will, by default, typeset the string in TTY font, but that can be changed to the more readable \urlstyle{sf}.
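A minimal sketch of the setup described above. The path after the host name is hypothetical and only there to show that the tilde survives.

```latex
\documentclass{article}
\usepackage{url}
\urlstyle{sf}   % sans-serif instead of the default typewriter font
\begin{document}
% \url handles special characters such as ~ and allows line breaks
% at sensible places inside the URL.
Fancy can be obtained from \url{http://fancy.org/~user/tools}.
\end{document}
```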
Graphics
Don't use bitmap formats for figures (nor bitmaps converted to EPS or PDF). They almost always lead to poor results.
Math fonts
Typesetting mathematics is a traditional strength of TeX. However, it is optimised for the more traditional kind of maths, where, besides a small number of predefined functions, people use single-letter variable and function names. In such a context it is customary to interpret a string like “abc” as the product of three variables “a”, “b” and “c”.
In our discipline we use a lot of multi-letter function and variable names, as we are used to from programming. Because TeX in math mode will consider “diff” a product of four variables, it will space it as such, with pretty ugly results. To avoid this, follow one simple rule: don't put words (of more than one letter) in pure math mode.
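One common way to follow this rule is to wrap each multi-letter name in a math font command. The sketch below uses the standard \mathit and \mathrm commands; the variable names are made up, and this is one option rather than the only correct one:

```latex
\documentclass{article}
\begin{document}

% Bad: each word is spaced as a product of single-letter variables.
$diff = new - old$

% Better: \mathit sets each name as one word in math italics.
$\mathit{diff} = \mathit{new} - \mathit{old}$

% Alternative: \mathrm for names that should appear upright.
$\mathrm{diff} = \mathrm{new} - \mathrm{old}$

\end{document}
```

Whichever form you pick, use it consistently for the same name throughout the paper.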

Some Auxiliary Material

My talk on paper writing

Here are the slides of my standard talk on how to write a good (systems) paper which I frequently give to research students and early-career researchers.

On evaluation and benchmarking

If you are a systems researcher, you might also be interested in my list of Benchmarking Crimes. There you'll also find a talk on evaluation and benchmarking.

And finally a nice example (from the Unix fortune cookie program)

Rules for Writers:
Avoid run-on sentences they are hard to read. Don't use no double negatives. Use the semicolon properly, always use it where it is appropriate; and never where it isn't. Reserve the apostrophe for it's proper use and omit it when its not needed. No sentence fragments. Avoid commas, that are unnecessary. Eschew dialect, irregardless. And don't start a sentence with a conjunction. Hyphenate between sy-llables and avoid un-necessary hyphens. Write all adverbial forms correct. Don't use contractions in formal writing. Writing carefully, dangling participles must be avoided. It is incumbent on us to avoid archaisms. Steer clear of incorrect forms of verbs that have snuck in the language. Never, ever use repetitive redundancies. If I've told you once, I've told you a thousand times, resist hyperbole. Also, avoid awkward or affected alliteration. Don't string too many prepositional phrases together unless you are walking through the valley of the shadow of death. “Avoid overuse of ‘quotation “marks.”’”



Gernot Heiser, gernot@unsw.edu.au.
Created 2001-08-24, last modified 2014-03-31, last validated 2014-01-23.