by Dave Swain

How to help PhD students write better code, and why it helps them submit their thesis on time

"An Australian Research Data Commons report recently identified that over 47% of research publications involve software development. At least this percentage and maybe more PhD students will likely be involved in developing computer code for their thesis."

Ensuring PhD students submit their thesis on time is a never-ending challenge. Investing in tools and training to help PhD students stay on track lifts research income and important organisational research outputs, such as publications.

Writing computer code is a growing requirement for completing a PhD. While research higher degree (RHD) students can take courses in statistics, data analysis, or how to overcome writer's block, there are very few courses focussed on helping them write and manage good code.

Good coding practices are integral to software development companies. They know the importance of productivity, efficiency, teamwork, testing and reliability, quality assurance and delivering high-quality outputs. I would argue that all of these attributes are essential for a PhD student to successfully complete a thesis on time and to build habits they take forward in their research career.

Imagine if, when a new research higher degree student started their PhD, they were provided with training and a framework before they began collecting data and writing code to generate models and analyse data. What if this framework was built around good software development practices targeting the critical elements of efficiency, productivity, teamwork and quality assurance? This might help students with their computer coding and create a broader framework that helps ensure they submit their thesis on time.

What are some of the things we can do to support research students as they start to develop research computer code? Here are a few tips:

1. Create an expectation and understanding that computer code is treated as a valuable asset.

Many supervisors and research higher degree students don't appreciate how valuable computer code is. Often, students will download open-source software and use a free integrated development environment (IDE) to write and run their code locally on their desktop or laptop. The output is not the code but usually some analysis or a graph that gets exported to a word processing package to go in a paper or the thesis. As supervisors and collaborators provide feedback and ask for fresh analysis, the code gets copied, and new files are generated.

Unlike the thesis, which is valued as it evolves towards a final submission, the code is seen as a temporary asset that delivers outputs and has no long-term value.

Journals increasingly ask authors to declare what code and software tools, such as generative AI, were used in a submission and how they were used. This request recognises that the code itself is an essential component of research and is a valuable asset. Encouraging students to think of code as an asset will help them adopt good coding habits.

2. Provide training opportunities to help students develop their software coding skills.

There are plenty of courses on statistics and using programming languages such as R and Python for research (we think Joachim and his team at Statistics Globe do an excellent job!), but writing the code is only one part of the equation!

Students also need to develop broader software development skills that go beyond writing the code itself.

These skills will not only help them in their academic careers, but they are also highly sought after by companies.

3. Encourage students to publish at least one paper that allows code to be included as part of the submission.

Research higher degree students need to see the benefits of better computer code management. Encouraging a student to submit a journal paper that includes the code behind the research article will demonstrate the importance and value of developing an open code framework.

Being open to scrutiny will build confidence and help students prepare their computer code as part of their thesis. Ultimately, students should provide examiners access to their code as an API, a link to a repository, or an appendix (and preferably all three!).

4. When students present their research plan, make sure they also include details on how they will manage and develop their research code.

Developing computer code to solve complex problems is not linear and often requires code refactoring and change. Without a predefined framework and clear work practices, writing code becomes disorganised, even with the best initial intentions.

When students present a research plan, they should include details on managing their computer code. What coding language do they plan to use? How will they manage version control? How will they ensure modules of code can be run independently but still be integrated as part of an overall thesis? Encourage students to use notebooks and markdown syntax to help with explanations and code reviews.
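To make the question about independent but integrated modules concrete, here is a minimal Python sketch of one analysis step. The file name `growth_summary.py`, the function names and the data are all hypothetical, chosen only for illustration; a student's own steps would look different.

```python
"""growth_summary.py: one self-contained analysis step (hypothetical example)."""
from statistics import mean, stdev


def summarise(values):
    """Return the count, mean and sample standard deviation of a list of numbers."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


def run_step(raw):
    """Drop missing records, then summarise. Other chapters import this function."""
    cleaned = [float(v) for v in raw if v is not None]
    return summarise(cleaned)


if __name__ == "__main__":
    # Running the file directly exercises the step on made-up data,
    # so it can be checked in isolation from the rest of the thesis code.
    demo = [12.1, None, 13.4, 11.8, 12.9]
    print(run_step(demo))
```

Because the work sits inside functions, a thesis-level script or notebook could import `run_step` and chain it with other steps, while the `__main__` guard still lets the student run this one file on its own.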

5. Encourage students to connect their computer code to a broader set of external users.

Code that forms part of a PhD is intrinsically inward-looking. In other words, it is focused on satisfying the student's thesis. This is fine, but the student misses the opportunity to use their computer code to get feedback on its performance and on its broader value, both to the research community and to non-research users, such as businesses.

This doesn't mean the student has to share their code, although sometimes this can be helpful. Instead, developing packages or API connections allows users to access the insights without directly accessing the code.
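As a rough illustration of the API idea, the sketch below uses only the Python standard library. The model `_estimate_yield`, its made-up coefficients, the field names and the port are all assumptions for the example, not a description of any particular platform. Users send inputs and receive a result; the model code stays on the student's side.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def _estimate_yield(rainfall_mm, fertiliser_kg):
    """Stand-in for the student's private model; the coefficients are made up."""
    return 1.5 + 0.004 * rainfall_mm + 0.01 * fertiliser_kg


def handle_request(body):
    """Map a JSON request string to (status code, JSON response string)."""
    try:
        params = json.loads(body)
        result = _estimate_yield(
            float(params["rainfall_mm"]), float(params["fertiliser_kg"])
        )
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected numeric rainfall_mm and fertiliser_kg"}
        return 400, json.dumps(error)
    return 200, json.dumps({"yield_t_per_ha": round(result, 3)})


class InsightHandler(BaseHTTPRequestHandler):
    """Expose handle_request over HTTP POST."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_request(self.rfile.read(length).decode("utf-8"))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload.encode("utf-8"))


def serve(port=8000):
    """Start the API on localhost; call this explicitly when you want it running."""
    HTTPServer(("127.0.0.1", port), InsightHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally. Keeping the logic in `handle_request` means the same function can be tested without a network connection and reused if the student later moves to a different web framework.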

The quality of the computer code is reflected in the quality of the outputs. Feedback from users can help refine the code and enhance the insights. Most importantly, this feedback allows an RHD student to deliver robust insights more efficiently in their final thesis.

In Summary
Developing good research coding practices helps research higher degree students get better outcomes. Most importantly, it provides a framework that helps ensure their final submission is on time or even ahead of time. We can provide training and tools to ensure RHD students get better outcomes from the computer code they generate through their PhD.

I'm Dave Swain. I have over 30 years of experience working in government research organisations and universities. As a researcher trying to commercialise code-based insight, I know the challenges for research students and the opportunities for industry to develop and use research code.

That's why I founded TerraCipher and developed a dedicated platform called Shaipup. It provides tools for researchers, higher-degree students, commercialisation offices, and industries to leverage their research computer code. Shaipup provides a framework, and we help and support users and developers to ensure they maximise the impact of their research computer code.

Do you want us to help you deliver greater research impact?

Benefits You Can Expect

Growing Research Networks
More Publications With More Citations
Impact Metrics to Increase Funding
An API that connects your code
Protected but accessible code
Monthly usage statistics
Transparent outputs that build trust
URL that can be used in citations
Quick and easy commercialisation options
A profile page to showcase your skills
A marketplace to promote your research
Direct connection to global users