Stop your CSCW paper from getting rejected.

From a blog post by Antti Oulasvirta

Peer feedback

Upload your paper and have a peer give you feedback. Price: Free! (you can give feedback too) [Learn more].

Statistics help

Schedule an online coaching session with an Ink-approved expert who will review your statistical analysis and give you feedback.
Price: $20/half hour [Learn more].

English writing

An Ink-approved expert in academic writing will give you feedback on the writing and flow of your paper. Price: $3/page [Learn more].


Most meta-reviews state up front that the topic of a submission is timely and/or important - so what’s the matter?

Every year over a thousand CSCW papers are rejected. Follow this tutorial to make sure you are not falling into the common pitfalls of most rejected papers. This tutorial is based on a blog post by Prof. Antti Oulasvirta who has been on the committee for HCI conferences. The original list was for Usability and UX papers at CHI, so specific criteria may differ a bit for other sub-committees.
Have suggestions for other kinds of CSCW papers? Let us know:

Research Strategy

Research strategy refers to the selection of the best possible method to address a given problem.

Here is a checklist to make sure you have the right research strategy:

(e.g. why a laboratory experiment and not a field study).

(e.g. why were users given a particular kind of feedback after each task?).

For example, a survey may have limited use for studying real-world practices of users.
Not sure about one of these? Ask a peer for help! [Learn more].

Statistical Conclusion Validity

Statistical conclusion validity refers to the reliability with which we can infer a relationship between two or more variables.

The following validity-related categories follow the taxonomy of Cook and Campbell (1979). Make sure:

Sometimes authors simply report descriptive statistics per user/group/condition, and draw conclusions based on means only.
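
One way to go beyond comparing means is a permutation test, which needs few distributional assumptions. The following is a minimal sketch using only the Python standard library; the function name `permutation_test` and the completion times are hypothetical and made up for illustration, and the test shown assumes a simple between-subjects design with two independent groups.

```python
import random
from statistics import mean

def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (extreme + 1) / (n_resamples + 1)

# Hypothetical task completion times in seconds (illustrative data only).
old_ui = [41.2, 38.5, 44.1, 39.9, 42.7, 40.3, 43.8, 37.6]
new_ui = [36.1, 39.4, 35.2, 37.8, 34.9, 38.6, 36.7, 35.5]

p_value = permutation_test(old_ui, new_ui)
print(f"mean difference = {mean(old_ui) - mean(new_ui):.2f} s, p = {p_value:.4f}")
```

Report the descriptive statistics alongside the test result, not instead of it.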

This is a common reason for rejection. It can refer to a mismatch between the test and the experimental design (see Kirk, 1995), a mismatch between the test and the level of measurement (nominal, ordinal, interval, ratio), or a violation of the test’s assumptions.

Be cautious if you do this. The reader must then either become familiar with the test or take a leap of faith; both may result in frustration.

For many reviewers failure to do this is a show-stopper.

Reviewers know that absence of evidence is not evidence of absence. Following APA’s recommendation of reporting statistical power would protect authors from this criticism.
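
To make the point about statistical power concrete, here is a rough sketch that approximates the power of a two-sided, two-group comparison with a normal approximation. The function name `approx_power` is hypothetical, and the approximation assumes equal group sizes and a standardized effect size (Cohen's d); an exact calculation would use the noncentral t distribution, so treat the numbers as estimates.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size_d * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Illustrative comparison: a medium effect (d = 0.5) with 12 vs. 64 users per group.
for n in (12, 64):
    print(f"n per group = {n}: power is roughly {approx_power(0.5, n):.2f}")
```

A non-significant result from a study with low power says little about whether the effect exists.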

Omnibus testing is not sufficient when you want to pinpoint effects for a variable with multiple levels.

Statistical testing for multiple variables (or their levels) should use an appropriate post hoc test that accounts for the inflated probability of Type I error.
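
As one illustration of controlling the inflated Type I error rate, the sketch below applies the Holm-Bonferroni step-down correction to a set of p-values. This is only one of several possible corrections, not necessarily the one a reviewer expects for your design; the function name `holm_bonferroni` and the p-values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Visit the p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as hypotheses are rejected: alpha/m, alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

# Hypothetical p-values from four pairwise comparisons.
pairwise_p = [0.01, 0.04, 0.03, 0.005]
print(holm_bonferroni(pairwise_p))
```

Note that the comparisons with p = 0.03 and p = 0.04 would count as significant without correction, but not after it.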

Sometimes reviewers find it irritating that authors report a whole bunch of significant effects, but concentrate on only those that are relevant to their conclusions.

A recurring issue is that authors following “grounded theory” code highly subjective categories but ignore inter-coder reliability assessment. Another example is use of noisy logging data.
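
Inter-coder reliability for two coders is often summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. The following is a minimal sketch; the function name `cohens_kappa` and the example codes are hypothetical, and it assumes two coders who each assigned exactly one category to the same items.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labeling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the coders agree.
    observed = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        # Kappa is undefined when both coders use one identical category;
        # this sketch returns 1.0 because agreement is perfect.
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to eight interview excerpts.
coder_1 = ["trust", "trust", "cost", "cost", "trust", "other", "cost", "trust"]
coder_2 = ["trust", "cost", "cost", "cost", "trust", "other", "trust", "trust"]
print(f"kappa = {cohens_kappa(coder_1, coder_2):.2f}")
```
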
Not sure? Ask an expert:

Statistics coaching session

Submit a request to have a stats coaching session with one of our experts. Your coach will ask about your test design and results and will help you fix anything that's wrong. Price: $20 for a half hour session



Questions? Email us at:

Dr. James A. Swartz is a clinical psychologist and Associate Professor in the Jane Addams College of Social Work at the University of Illinois at Chicago. His research focuses on the epidemiology of co-occurring mental illnesses, substance use, and medical conditions. He teaches basic and intermediate regression to doctoral students at the College.

The Stanford HCI group has recruited these experts from online marketplaces (e.g. oDesk) or from campus and interviewed them.
Read about the Stanford HCI guarantee.

Internal Validity

Internal validity refers to the plausibility of a causal relationship between two variables.

Experimentation often relies on the logic of “all other things being equal.” Conditions/groups in a poorly designed experiment differ in more than one dimension. For example, maybe participants in two interface groups also completed different tasks.

For example, you claim to induce an emotional state by showing pictures before a usability test, but fail to check that the manipulation actually has the desired effect.

The “classic” nuisance variables in HCI are order effects. Failing to randomize or counter-balance the order of experimental conditions/groups is often a show-stopper. But then there are more sophisticated nuisance variables as well.
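
Counterbalancing is commonly done with a balanced Latin square, in which every condition appears once in each position and every condition immediately follows every other condition equally often. The sketch below is one standard construction; the function name `balanced_latin_square` and the condition labels are hypothetical. For an odd number of conditions it appends the reversed orders, so twice as many orderings are needed.

```python
def balanced_latin_square(conditions):
    """Return a list of condition orders, one per participant slot."""
    n = len(conditions)
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    base, lo, hi = [0], 1, n - 1
    for k in range(1, n):
        if k % 2 == 1:
            base.append(lo)
            lo += 1
        else:
            base.append(hi)
            hi -= 1
    # Each further row shifts every entry of the first row by one.
    rows = [[(c + shift) % n for c in base] for shift in range(n)]
    if n % 2 == 1:
        # An odd number of conditions needs the mirrored orders as well.
        rows += [row[::-1] for row in rows]
    return [[conditions[c] for c in row] for row in rows]

# Hypothetical interface conditions; participant i gets orders[i % len(orders)].
orders = balanced_latin_square(["A", "B", "C", "D"])
for order in orders:
    print(" -> ".join(order))
```

Recruit participants in multiples of the number of orders so that each order is used equally often.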

Selection bias happens when users are not assigned to experimental conditions/groups randomly.
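
Random assignment is easy to script, which also leaves a record you can describe in the method section. This is a minimal sketch with hypothetical names (`assign_conditions`, the participant IDs, and the condition labels); it shuffles participants and then deals them into conditions so group sizes differ by at most one.

```python
import random
from collections import Counter

def assign_conditions(participants, conditions, seed=None):
    """Randomly assign participants to conditions with near-equal group sizes."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants into conditions in round-robin fashion.
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}

# Hypothetical participant IDs and conditions.
participant_ids = [f"P{i:02d}" for i in range(1, 11)]
assignment = assign_conditions(participant_ids, ["baseline", "prototype"], seed=42)
print(Counter(assignment.values()))
```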
Not sure about one of these? Ask a peer for help! [Learn more].

Construct Validity

Construct validity refers to the cause and effect construct that explains the causal relationship.

Papers that use a model to explain obtained data may choose one that is wrong in the eyes of the reviewer. Papers aspiring to Fitts’ law modeling sometimes face this critique.

Reports of quantitative relationships between IVs and DVs are unconvincing unless accompanied by explanations based on qualitative sources, such as interviews. Arguments like this point toward favoring mixed methods research in HCI.

E.g., measuring “user preference” when aspiring to measure “user experience”. Or ignoring errors in a measure of typing speed.

For example, claiming to study the effect of “aesthetics” but having only two conditions to compare. Reviewers often point out that there are too few levels in the chosen IVs.

For example, authors want to measure user experience but collect data several weeks or even months after use, ignoring the effects of forgetting and interference on the veridicality of user experience accounts.
Not sure about one of these? Ask a peer for help! [Learn more].

External Validity

External validity refers to the generalizability of the causal relationship across persons, settings, and times.

Do not overstate the generalizability of your findings. For example, using students but drawing implications for all healthy users.
Limited generalizability is among the most common reasons for rejection and it comes in many flavors. The criticism focuses most often on sample, tasks, user interface, and method. For example, convenience sampling (using people from your own lab), contrived tasks, short duration of study, unrepresentative user interfaces/systems, or even wizard-of-oz or paper mock-ups instead of working prototypes.

For example, using blindfolded healthy adults and claiming generalizability to blind users. Authors with little training in accessibility often fantasize that a piece of technology would be useful for a particular user group but do not care to use them as test subjects or even ask them.

Experimental analogue refers to the resemblance between the setup in the lab and the target use condition “in the wild” to which the results should generalize. For example, decorating the usability lab to look like a living room with couches, etc. may not convince reviewers that it is an effective analogue for inducing home-like behavior.

Overusing one DV in generalization while downplaying others.
Not sure about one of these? Ask a peer for help! [Learn more].

Scientific Communication

Scientific communication refers to the ability of a writer to convey complex phenomena correctly. The lowest-scoring papers are almost without exception guilty of this.

For example, don’t put in big tables for descriptive or inferential statistics.

If you have more than 2 DVs and/or more than 2 IVs, presentation of results requires serious thought!

Jumping into inferential statistical analysis without descriptive statistics leaves the reader no chance of evaluating what you did.

Define and label your key variables, and use them consistently.

Why did you use that measurement instead of the other ones available?

Cramming two contributions into one paper leads to inadequate space (“the superpaper syndrome”).
Not sure about one of these? Ask a peer for help! [Learn more].


Method sections of empirical papers should be written such that the reader can replicate the study.

In the worst case we witnessed, the paper did not report even the sample size.

Don’t miss any parts. I strongly advise following the APA template for description of method, and deviating from it only with good reasons. Reviewers are familiar with this format and can easily follow it; plus, you ensure that all elements of your experiment are described.

For example, conjuring coding categories in “grounded theory” is often guilty of this.
Not sure about one of these? Ask a peer for help! [Learn more].

Applicability of Results

Let’s assume the paper was so good that reviewers found no flaws related to validity or communication. The final stretch where papers get killed is the interpretation of findings:

Some papers have findings that are predictable in light of common sense or previous work.

Sometimes the effect size is negligible, yet authors argue for real-world implications. It is advisable to report effect sizes, as APA recommends, to avoid speculation on the reader’s side.
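
For a comparison of two independent groups, a common effect size is Cohen's d, the difference in means divided by the pooled standard deviation. The following is a minimal sketch; the function name `cohens_d` and the ratings are hypothetical, and other designs (for example within-subjects) call for other effect size measures.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    # Pool the sample variances, weighting each by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Hypothetical satisfaction ratings on a 7-point scale (illustrative data only).
prototype = [5, 6, 4, 6, 5, 7, 5, 6]
baseline = [5, 5, 4, 6, 4, 6, 5, 5]
print(f"d = {cohens_d(prototype, baseline):.2f}")
```

Reporting d next to the p-value lets readers judge for themselves whether the difference matters in practice.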
Not sure about one of these? Ask a peer for help! [Learn more].

Writing Feedback

Ask a writing expert to give you feedback and suggestions to improve the English writing and flow of your paper.
Price: $3/page (courtesy of Stanford HCI)



Questions? Email us at:

Dr. Lauren B. Collister is a sociolinguist and an advocate for Open Access. Her research focuses on language and identity in digital cultures. She works for the Office of Scholarly Communication and Publishing at the University of Pittsburgh.


Peer Feedback

Send your paper draft with a specific question. Your request should not take more than half an hour of your peer's time. For example, ask them to read a 1-2 page section or help you find related literature for a specific claim.
Price: free
You can also help by giving feedback. Choose a task from the list below:



Questions? Email us at:
Read about the Stanford HCI guarantee.

List of available tasks:

1. Help me find literature on the challenges that rising entrepreneurs face, and the tools that can help them.
Title: Drawing Crowds Out of Markets and into Users’ Just-In-Time Needs