Malvika Sharan: All right.
Welcome everyone. This is another cohort call. Today we
have four great speakers who are going to talk about open source
development, and Open Science in general. So this is our first
module of three Open Science modules. Some of the logistics,
again, please describe by writing your name, starting with
S or W if you prefer being in a spoken or written breakout room,
please make a choice for one to make it easy for us to assign to
one room. We have an Otter live transcription going, so you
would be able to click on the link on the top of your screen
and watch it work - it's wonderful. And another thing to take
note of: we have a code of conduct that applies to this
call. So if there is anything at all that you would like to report,
please email team@openlifesci.org. You can
also reach out to Bérénice, Emmy or me separately if you would
prefer to reach out to a single person. So there's a huge
announcement today: we have been running a poll for naming your
cohorts. Our first cohort was called Open Sesame (Seeds). Our
second cohort was called Mas ed cohort. And now we have the
Perseverance cohort. So this is your name, you will be known
forever as Perseverance of Open Life Science. So thank you for
polling. With that, I will hand it over to Bérénice to take you
through what we're going to cover.
Bérénice Batut: So thanks, Malvika, for starting the call. So today
will be our first call about Open Science, and we will mostly
discuss the project management skills
during development stage. So there will be many. So there
will be three calls on Open Science. So this call will
mostly focus on open source software, open source hardware, and
open data. But it's the first call. There will be more aspects
covered during the next calls. There are two calls that will
happen in the next week or two. So the first speaker is Renato.
Renato will speak about agile and iterative
project management methods. So I will hand it to Renato. Renato,
you're here. Do you want to share?
Renato: Yep, I'm here. Let's see if my internet collaborates. Can
you give me permission? Yeah. Can you track it?
Unknown: Give it a try.
Renato: Okay, can I get some thumbs up if you can see this?
Bérénice Batut: Perfect. So you now have around 10 minutes for
your presentation, and then you have some discussion questions.
So participants, feel free to drop your questions in the
document. Okay, thanks.
Renato: That's all right. So hi, everyone. Um, so first of all, let me just open
this by saying that I am not an Agile expert. These are just
some techniques that I that I tend to use, often on my day to
day. And, and we will explain a little bit what this is and how
you can benefit from using this. So first of all, what is Agile.
So Agile is a bit of a principle or a framework to
organise and structure projects; let's call it
a project management technique.
If you're one of these people that tend to have a lot of
post-its around with tasks to do and so on, you're probably
already even using some of the techniques that the Agile
movement inspires. But to perhaps keep it a bit more
focused on the point. So Agile is something that started in
industry, it was mostly driven by an intention to deliver a
good product to the customers, but at the same time
to deliver that product as quickly as possible, even if in
a prototype state. But at the same time allowing to again, as
quick as possible, adapt and modify the product to better fit
the needs of the client. And so, as you can kind
of see from just the Agile software development manifesto,
this was, as I said, something that started in
industry but specifically in the software world. And the
striking point, and the reason why it's called Agile,
is in part because of this. I don't know if you can
see my mouse cursor. Malvika, can you maybe give me a thumbs up?
No? Okay, then let's see if I can do this. Can you see it
now? Yeah. So what I was pointing to is responding to
change, this was one of the aspects that kind of motivated
to give this name to, to the movement or to the framework,
and Agile in the sense of being very fast and kind of responsive
to requests and changes. And so, if you
look a little bit into what these, this framework consists
of, or if you compare it with other kind of approaches that
people use in the field, you may encounter these these words,
Waterfall. And you might find also, other names for for this
kind of iterative process. There's many, many different
ways that you can structure this. And I'll explain a few in
in a second. But if you just look at the waterfall
concept, you can see that it falls from one stage
to the next, the key point here is that all of the project plan
is kind of defined from the beginning. And you just follow
along in a sequence in a sequential fashion. So you can
imagine that, if at any point during design, or at any point
during the implementation, or one of the later stages, there's
something that you need to change about the plan, this
process is rather rigid and doesn't really allow this. And
so the iterative process is a bit more generic or a bit more
responsive. It typically starts by breaking a larger problem
into smaller chunks, so that they become more actionable. And
exactly what small means. This is what differs a lot between
different paradigms, but it can be something that lasts a day,
it can be something that lasts a few hours. And then you do
these kinds of sprints, or you aggregate these tasks into
milestones. And once each milestone is completed, then you
reach a stage where you have like a first release or a
prototype of the thing that you're trying to accomplish. In
the case of your OLS project, a milestone could be, for
instance, a website that you have to build. And you can think how to
break that big task into smaller chunks. And
the first article that you write could be an example of
a first deliverable. And so to sum it up, it's a technique
primarily for software development, but we can use it
elsewhere. And we can go a little bit more into detail of
how to actually do that. It's great for project
management, it helps you visualise the work that you
still have to do. And also to invite others to join the
project as well, because everything is very visible in
terms of what needs to be done and what has been done. And then
there are lots of variations in terms of
how this is structured, and also in terms of advantages over the
traditional waterfall method. So I mentioned breaking down
things into tasks and milestones, and so on. And I mentioned
as well that you could aim for slices of one to two
hours, ideally not more than a day or two. The reason for this
fragmentation is that you want to have a good sense
of progress. It's very often the case that you estimate a
task to be one or two hours, and then it turns out that you spend
an entire afternoon or something, because we get distracted,
because there are other things that we didn't really think of. And
the idea here is that the Agile movement will help you structure
those things that are outside of the tasks that you're doing.
They just become tasks again, which will get picked up later
and moved to a milestone that you will complete at a later stage.
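As a rough illustration of the idea just described (small tasks grouped into milestones, with unplanned work captured as a new task and deferred to a later milestone), here is a minimal Python sketch. The `Task` and `Milestone` classes, the 16-hour limit and the example task names are all hypothetical, chosen only for illustration; they are not part of GitHub or of any tool mentioned in the talk.

```python
from dataclasses import dataclass, field

# Assumption: "not more than a day or two" is read as two 8-hour days.
MAX_TASK_HOURS = 16


@dataclass
class Task:
    title: str
    estimate_hours: float
    done: bool = False


@dataclass
class Milestone:
    name: str
    tasks: list = field(default_factory=list)

    def add(self, task: Task) -> None:
        # Refuse slices that are too big; they should be broken down first.
        if task.estimate_hours > MAX_TASK_HOURS:
            raise ValueError(f"'{task.title}' is too large; split it into smaller tasks")
        self.tasks.append(task)

    def progress(self) -> float:
        # Fraction of tasks completed in this milestone.
        if not self.tasks:
            return 0.0
        return sum(t.done for t in self.tasks) / len(self.tasks)


# Hypothetical website project, following the example in the talk.
first_release = Milestone("First article online")
later = Milestone("Later")

first_release.add(Task("Acquire a domain", 1))
first_release.add(Task("Write the first article", 2))

# Something unplanned comes up while working: capture it as a task
# and defer it to a later milestone instead of getting distracted.
later.add(Task("Fix broken logo on mobile", 1))

first_release.tasks[0].done = True
print(f"{first_release.name}: {first_release.progress():.0%} done")
```

The size limit is one possible way to enforce the "small slices" advice; a real board would leave that judgement to the people doing the work.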
And so to give you a more concrete example, or a real life
example, if you have already explored a little bit of GitHub,
and if you've perhaps browsed some of the existing projects
there, you might see a project like InterMine, where in
this diagram, each of these grey boxes is, sorry for the
background noise, each of these grey boxes is
a version release. And the tasks themselves are within each of
these grey boxes. And in this case, for the milestones, there are
several of them, and you can already see some estimate of when
these would be achieved, and also a very colourful
interface for how to label things and how to
structure them, not just in terms of milestones, but also in terms of
what these tasks are all about. And in a slightly
different way, one of the more popular paradigms in Agile
is the Kanban-style board, where instead of having the versions
or the milestones, as we mentioned before, you have just
the notion of what is to be done, what is in progress and
what is already completed. So this is like a simplified
version. It is not so focused on the software versions or
specific milestones or goals, but it's more to capture what is
actually being actively worked on. And you could do this
process within a milestone. So all of the tasks that you now
see on the screen could be within one milestone alone. This
is also to say that for GitHub, there are some simplifications and
some automation that you can do. And we can talk about
that later if you're interested. And so just to wrap up some
examples: in
the website context that I mentioned before, one big
task could be to acquire a domain for that website. And
then you can see an example of how to break that task down into
smaller tasks. Or if you have a specific section of the website
that you want to create. And then you can see that for that
there's perhaps more tasks that need to be completed. And so
ideally, you would break it down again into subsequently
smaller steps. I will skip this for the sake of time. And I
didn't go so much into the jargon that is involved. I
mentioned that there are different sub-frameworks: Scrum, Kanban,
and Extreme Programming are some examples of these. They all
follow the Agile principles, what changes between them is
sometimes how big these tasks are, how big the milestones are,
how often you do the larger loop or the smaller loop,
for how long you do these things called sprints, which is kind of
a way of getting the entire group working collectively on a
set of tasks, and then just how you structure it. But without
going too much into detail on that, it's just different ways
of handling the workload and prioritising the tasks that you
have to do. And then to finalise: I
tried to be very superficial here, to give a
high-level introduction to this topic. If
you're doing software development, you will find that
all of this probably translates a lot more: versions, milestones
and so on all make sense. But I use this
personally for my own day to day, just to manage tasks
that I have to do, things like reporting or planning meetings,
or anything of this sort. And I find that it
works rather well. Okay, so I think I'm a little bit over
time, so I will close it there.
Bérénice Batut: Thank you for the
presentation. Okay, thank you. Does anyone have any question
for Renato, so feel free to ask them in the chat or put them in
the document? Currently, there is no question there. Anyone
wants to ask anything? Do you have any question? So I could get
one.
Malvika Sharan: Yeah. My question is specifically, Renato, how can
this be applied on a day-to-day basis? It sounds like you
always have to do it for a big project. But do you have
tips and tricks for day-to-day agile development?
Renato: So I don't have a
screenshot that I could easily show with all the tasks that I
have. But what I do is I try to keep it very colourful,
which also keeps me motivated. I try to split up tasks not
just by milestones but also, for instance, by what kind of
topic they are. If I have a meeting that I want to plan,
then I will create a milestone that is that meeting, and this
could include, for instance, inviting all the speakers or
getting in contact with many people. And then you could also
have, for instance, a report that you need to write, and that
report could also be a milestone or a task, depending on how you
break it down. And then within that report you have, let's say,
bullet points of things that you want to write, or that you want
to make sure are included. And in terms of
the day-to-day process, in GitHub you can also
have these little checkboxes that you can click within a
task. So you can do an outline of lots of bullet points, and
you just go and start clicking as you
progress through them. At the end of the day, you can then
review what you achieved and sort of plan for the next day
what you want to pick up again, and it can be the same task, but
just a different point in the same task, or it can be a
different task, depending on how much time is left.
Bérénice Batut: So are you using
just GitHub for that? Or do you have another tool that you
recommend?
Renato: So I've used
Trello in the past, and actually the GitHub
interface that was in the screenshots was very much
inspired by the Trello interface as well. But nowadays, I
mostly use not GitHub but GitLab, because we have a GitLab
instance running locally. And it also has not just
an issue list, but also these boards where you can move things
around, and you can make those colourful labels
on the issues; you can turn them into some sort
of milestone or category, like to do, pending and done
and so on. I find that this is working well. You can
also have notifications, so you can set deadlines on tasks and
so on, and it sends you reminders when those milestones
get close. So yeah, I find it very practical. The
only discipline that you need to have is to visit it as often as
you can, because there is a little bit of maintenance in
keeping it up and shiny, so to say. If you don't go
there often, then you can imagine that some of the tasks
become obsolete, and some of the things will also
have to be modified or updated if things changed.
Bérénice Batut: Thank you. Other
questions? So do you want to verbalise your question? Or do
you want me to read it right now?
Emmy Tsang: I can do it.
Unknown: Thanks a lot, Renato, for this presentation; it is very
useful. My question was: beyond personal use, do you have some
tips or experience making this part of the community culture in a
group? For example, I found that when you have a heterogeneous
group, it can be complicated sometimes to
convince people to use these.
Renato: So I don't know how
much I can reveal there. But in my previous group, we had
that dilemma at some point. A lot of the communication and a
lot of the core value within the group was being sort of
organised by email. And this had the disadvantage that there was
a lot of work being done that was not really visible. And
at some point, we decided to change this; it was a bit of a
group effort as well. And we did exactly what I
described. So we moved to GitLab, we started using the GitLab
issues; you could do the same with GitHub as well. And
it allowed not just everyone to see what was going on, it
also allowed a lot of reduction in terms of duplication of work.
And it also opened up the space to have some kind
of automation for some things. So, for instance,
if you wanted to do a specific task that the group already
had as a core pipeline, or some software analysis or something
like that, then you could just create an issue, and there
would be one person specialised in that who would take care of
handling that task. So what I'm trying to say here is that
you can assign people to categories of tasks or topics,
and then people take responsibility for those. But it
doesn't need to be kind of like a common agreement that
everyone will share a little bit of the load in
terms of maintenance. Once you have a core of people
coordinating everything the same way, and this is one of
the things that you need to agree upon with
your community, then it runs rather smoothly, from my
experience.
Bérénice Batut: Thanks. Thanks a
lot. So thanks again for this talk and for your question.
So now we move to the next speaker. So the next speaker is
Helena, she will talk about open source software. Helena, you
have 10 minutes for your presentation and questions after. Thanks. You are muted.
Helena: I am muted, okay.
Everyone can see it? Okay. Perfect. So my name is Helena
Rasche. I've been working in open source for quite some time now.
I'm currently employed by the Erasmus Medical Centre and the
Avans Hogeschool in Breda. And I'm here to tell you a little
bit about open source in research. So first, what is free
and open source software, the term is a bit important. Because
open source software and free software are not necessarily the
same things - free software doesn't mean open.
Many of us remember freeware on the internet from when we were
younger. This is software where you have no access to the source
code, and you can't work with it, you can't modify it due to
the licence. This is not great for science. On the other hand,
open source doesn't mean free. So for those of you who are
producing research software, just because you're making your
software open source, it doesn't mean it has to be free software.
And the intersection of these two is
often known as free and open source software, or FLOSS -
free/libre open source software. And this just refers to all
software that is both open and free. Open source is simply
licensing your work so it can be used how you want. Making
your software open source is just a matter of setting out the
terms of how you want your software to be used, it doesn't
mean that you have to give up control of your software, it
doesn't mean that you're putting it out to the community forever.
It just means you're choosing and saying for yourself setting
boundaries on how you want your software to be used. So it's
a very important step all the time. For those who want it,
open source also makes it easy for others to remix your
software to reuse it to build upon your software, add new
features if they want. And if you're building a project where
community is important, where you want your software to be
used by a lot of people, making it open source in such a way
that people can reuse and work with you can be a really great
boost for your software just in terms of visibility and people
who want to use it, things like that. So why use and promote
open source? Open source is a part of the ethics. All these are options. And if
you want people to be able to use them, you need to licence
it. One of the common fears I hear about a lot is, if I
publish my software online, if I do it in the open, I'll
be scooped: someone will take my software and claim it's theirs.
But this isn't necessarily true. If someone steals your software,
there is at least a traceable log in GitHub, which we'll
get to in a minute. There is a copy of your software already
online. And if you already have a community around that, it'll
be very obvious that someone took it. And if you're still
worried, I've heard some people saying that they published
preprints as a way to document that, hey, they were first to
write the software, and to really make sure they stake
their claim on that software. So don't worry about that, just work
in the open. It's better for the community, it's better for the
world, and it's good for science. Publishing and sharing
open source code: one of the easiest and most effective ways
to do this is version control. I'm sure you're all learning
about Git and GitHub if you haven't already. But version
control is a fantastic way to publish and share code with
others. It gives you a whole timeline of your software, and
it makes it easy to reuse, contribute and make
modifications to your software. So why? Collaborating is easy.
One of the common things is reverting accidents. If you make
some bad changes in your code, or one of your collaborators
does, you can always revert, you can always go back to before
then. It makes it easy to integrate the changes from
multiple developers, like with Galaxy, with some 200
contributors to the code base or the training materials as well.
And all of us can work together collaboratively because we use
version control. Also, offsite copies of your software:
everyone has computer issues, everyone loses a hard drive at
some point or gets their hard drive encrypted by some
hackers, things like this. If you have all of your work in the
open, in public, then you can just download a new copy again
and start working again. Git and GitHub are a very
common choice. Git is one of the most common version control
systems, there are others. GitHub, likewise, is one of the
most common Git hosts. But there are others; it depends on what you
want to use. One of the nice things you get with GitHub is a
large existing user base and a large community of people who
will be able to contribute to your software, so a low barrier for
entry. If you need to learn more about Git, there's a great set
of lessons from Software Carpentry. But one of the
important notes is that Git is very, very complex. My partner
teaches a Git together session where they teach how to use Git
to the colleagues in our office. And there's just so much to
learn about Git, but you don't need to learn all of it now.
Just start with the important parts; the rest comes later.
You'll see a lot of guides online that will say, oh,
you need to learn about how the commit graph works and things
like this. But if you just want to publish your code, you don't
need any of the fancy stuff. So, a few steps to make your work
open source. A README is a very important part of this. If you
want people to know what your software is and how to start
using it, that's the number one thing people will see. So be
sure to include lots of good images there. Licence: if it's
going to be open source, it needs a licence file.
It takes two minutes to add a licence with GitHub, it's
really easy, just doing the same thing. Contributing guide:
GitHub has some template contributing guides that make it
really easy to tell people how you want them to contribute to
your repositories. Having a public roadmap: Renato
talked about the Kanban boards and agile. Having a
public roadmap is a great way to tell the community what you're
working on, what features you're going to implement, things like
that, which can help people get excited for your software.
Publishing a list of issues, same: it feels bad to say, hey, my
software has bugs. But at least by putting them out in the open,
you can track the things you've done or not done. A code of
conduct, as I'm sure you've learned from OLS, is a very
important thing. Contact and citation can be very useful as
well. If you have a GitHub repository, you can easily get
those from Zenodo and figshare. If you need a.. there's a nice
website for how to choose a licence. There are lots of
different licence choices, and they give you a lot of
freedom to choose what you want your software to be able to do
or what you want other people to be able to do with your
software. Some people don't want businesses to use their software
for free, and there are licences that support this, or some people
want everyone to use it for free. Lots of different choices.
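The checklist of repository files mentioned above (README, licence, contributing guide, code of conduct, citation) could be checked with a small script. The following is only a sketch: the `missing_items` helper is hypothetical, and the file names in `EXPECTED_FILES` are assumed common conventions rather than something stated in the talk; a project may well use other names.

```python
from pathlib import Path

# Assumed conventional file names; projects may use other names or extensions.
EXPECTED_FILES = {
    "README": ["README.md", "README.rst", "README"],
    "licence": ["LICENSE", "LICENSE.md", "LICENCE"],
    "contributing guide": ["CONTRIBUTING.md"],
    "code of conduct": ["CODE_OF_CONDUCT.md"],
    "citation": ["CITATION.cff"],
}


def missing_items(repo_dir: str) -> list:
    """Return the checklist items for which no expected file exists in repo_dir."""
    root = Path(repo_dir)
    missing = []
    for item, candidates in EXPECTED_FILES.items():
        # An item counts as present if any of its candidate file names exists.
        if not any((root / name).is_file() for name in candidates):
            missing.append(item)
    return missing


if __name__ == "__main__":
    # Check the current directory and report what is still missing.
    for item in missing_items("."):
        print(f"Missing: {item}")
```

This only looks at the top level of a directory; a public roadmap and an issue list live on the hosting platform and are not covered by such a check.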
But the ultimate goal, of course, is full reproducibility.
And we're getting a lot closer with things like Jupyter and
Binder, where you can publish your software but also a
notebook where people can run your software online, which is a
fantastic way to get people to use your software. Taking it
further, there are a lot
of ways for you to contribute if you're a first-time contributor
and to get involved in the open source software movement and
contribute to the open source communities. So there are lots
of nice links here if you want to explore them. And lastly,
The Turing Way has a nice handbook on reproducible data
science and making software open source and publishing and making
it accessible to people. So with that, thank you. I think I'm 30
seconds over time. And let me know if you have any questions.
Bérénice Batut: Thanks, a great
talk. Does anyone have any questions about
open source software? Feel free to ask. I don't see any. Oops,
sorry. Okay. So one question for Helena: if you
are a newcomer to open source, which events do you really
recommend joining to get familiar?
Helena: Definitely
Hacktoberfest, but that's just because you get a free shirt if
you contribute four pull requests, which is always a good
motivator. That's what I contribute to every year. I have
not participated in the others that were listed there, like the
Mozilla Global Sprint or 24 Pull Requests, so I can't speak
for those. Maybe you know about Mozilla.
Bérénice Batut: So is
the Mozilla Global Sprint still running? I don't think so.
It ran for a few years, but I didn't hear a lot recently
about that. But it was a nice event, a global event. But
Hacktoberfest, definitely, I liked it. Especially because
people on GitHub are usually labelling their projects'
issues that are easy, that are for newcomers. So it's a
nice,
Helena: really nice thing that
people do: easy, low-hanging-fruit issues that people can
contribute to.
Bérénice Batut: And there is a
question: can you share some good tips for open source
software maintenance? How can we get, sorry, how can we get more
people learning and contributing to my project?
Helena: This is a broad
question. Maintenance is always a long-term topic, and
depends on your funding structure, of course. But if you
can build a community around your project, a lot of that is
made easier by doing these things like the code of conduct
and the contributing guide. These tell users and
newcomers how they can easily contribute to software; they
give, you know, a point-by-point
guide. The training materials are
a good example here. We say: do you want to contribute a
training material? Then here are the steps you need to follow,
here is how to set up your environment to contribute, here
are the contributing guidelines. So we need your, you know,
information, we need it to be spell checked, things like this,
check for the build errors, things that someone
can read and say, okay, I know exactly what I need to do to
make a really good first-time pull request. That's one of the
best things that you can do to
start building the community, which will, in the long term,
hopefully help people maintain and contribute to your project.
Learning about your project: this is going to be a matter of
having good documentation. And there are a lot of different
types of documentation. There's developer documentation. There
are things like the API documentation: if you write
software, what are the parameters for this function call. Then there's
documentation like tutorials, which are really for a
different audience completely, but also a necessary part of
onboarding people. There are different steps: first,
you want people to know how to use your software in general, so
they need sort of training materials to get them started
with how to install, how to run, what are the commands, blah,
blah, blah. But then you also need the next level for once
people get deeper; you need the developer documentation:
here's how you can contribute, here is the structure of my
software, here are the different components that you might be
wanting to work with. And then you also need the last one,
the API documentation: here are the exact parameters you can
call if you want to integrate with this software, that sort of
thing. I hope that covers it. There's a framework or good
graphic that I found a long time ago that I really liked.
There are just different types of documentation and they're all
important.
Bérénice Batut: Just don't be afraid,
everybody, you don't need to have everything from scratch
there. You can build that with your community, onboarding
people to help you write this documentation. It's one aspect
where you can get your community behind it, because
it can be a bit frightening when you think of
everything you need to do.
Helena: Yeah, very good point.
Thank you for emphasising that you can really build it up
over time. It doesn't need to be there from day one.
Bérénice Batut: And the mentoring
aspect: empowering people, mentoring them through
your programme so that they form new people, who can become more
and more. Yeah, I think it's also there. So I'm answering
your question. So what is the next question? How can I
convert a non-open-source repository to open source? It's
the last question, then after that we move on. Okay.
Helena: The first step of this
is usually to make sure you remove any secrets from the
repository's history. A lot of times when people have closed
repositories, they hard-code things like database passwords.
Make sure you strip all of those out of the history. And then
once you've done that, you can just make it open
and add a licence. Adding a licence is really the only thing
you need to do to make it open source software, and publishing
it somewhere like GitHub. After that, everything else is
extra, it's decoration on the top. It's nice decoration, but
it's not necessary.
Bérénice Batut: Thanks for the
quick answers. And then I hand over to Emmy. Thank you.
Thanks.
Emmy Tsang: Thank you so much.
Very nice. And then next, we have Esther.
Esther Plomp: Hi. Oh, hope you can hear me.
Perfect. Thanks. I hope you also see my slides. Perfect,
wonderful. So hi Perseverance, I am very excited to be talking to
you today about open data, which is one of my favourite topics.
So my name is Esther Plomp. I'm a data steward at the Delft
University of Technology in the Netherlands. And I'm also one of
the mentors of this cohort.
And you can find the link of this
presentation in the slide itself. And I should have also
pasted the link in the notes. So do let me know if you cannot
access it. So open data. What is open data? Open Data is any data
that's made freely available for use and reuse by anyone and
everyone. So what does that mean? It means that everyone
should have access to the data. So it should be available on the
internet, for anyone to access on demand. Everyone must be able
to use, reuse, and redistribute that data. So participation is
also very important here. It should also be transparent what
kind of information or what kind of open data you're accessing.
So there should be some information about data
generation, and collection and about the data. In terms of
reuse and distribution, that's very important for open data,
you should be able to redistribute data without a lot
of restrictions. And we'll get back to that later. And open
data should also be interoperable. And that means
that you can integrate it or link it to other data. And if
you do that, in a machine readable format, it's easy to do
that automatically. So these definitions are from the Open
Data handbook and the World Bank and I pasted those things in the
slides. So do have a look at those websites. If you want to
learn more about the terminology. What is open data
and what is not? Just to highlight that the sentence
"data will be available upon request" is not a good sign in
terms of open data. I'm sure we've all seen this sentence in
a publication, where the author states that, sure, if you email
me, or if you call me, I will provide you the data. So it is
not open, because you can't actually access it by yourself;
you would have to go through the author. And a study by Vines and
colleagues has indicated that this is not as easy as it sounds. You
can email researchers, but because we tend to hop
institutes, we change our email addresses quite frequently. So a
lot of the time, it's going to be difficult to contact a
researcher. So their conclusion in their paper was then also
that research data cannot be reliably preserved by individual
researchers, which might sound a bit harsh. But to be fair, I
don't think we should be asking individual researchers to
preserve data for the long term. That's why we have data
repositories. What is also not open data is FAIR Data. So this
is a term that's been used increasingly in the data world,
but it's actually not the same as having data that is open. So
FAIR, means
findable, accessible, interoperable and
reusable. What does that practically mean? I put some
links into the slide where you can look it up in detail but I
am going to do it very briefly right now. So findable means
that your data is findable on a data repository with some
metadata. So information about the data, and a persistent
identifier. And this persistent identifier ensures that you do
not end up with a broken link. So that means that your data is
going to be accessible as well. So accessible seems kind of
like it's open. But that's not actually what accessible in
the word FAIR is about. Accessible means that there's a procedure
in place that allows you to obtain access to the data.
Potentially, your access can still be declined, but there
should be a procedure in place. So this is where it's different
from open data: FAIR data is not necessarily open. It can be, but
not necessarily. But here also, we see that
interoperable plays an important part of FAIR and your data is
interoperable when you use open formats. So formats that anyone
can use, such as PDF instead of Word documents. For example, not
everyone has a Word licence and can just use that software. But
also, if you use commonly used vocabularies or standards to
describe your data, that means it's easier to integrate it
with other data and more easily understandable for others. The
same goes for reusable. If you want your data to be reusable,
you should document it well, so that others can interpret the
data. And in order to do that, it has already been mentioned
for software, but the same goes for data: you can set up a
README file, and I put two examples of README files in the
presentation. So do please have a look if you want an example.
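As a rough illustration of what such a README for a dataset could contain, here is a minimal Python sketch. It is not taken from the examples linked in the slides: the function name `build_readme`, the section headings, and the sample values are all assumptions made for illustration.

```python
def build_readme(title, creator, licence, description, files):
    """Return plain-text README content describing a dataset.

    All field names here are illustrative assumptions, not a standard.
    """
    lines = [
        title,
        "=" * len(title),
        "",
        f"Creator: {creator}",
        f"Licence: {licence}",
        "",
        "Description",
        "-----------",
        description,
        "",
        "Files",
        "-----",
    ]
    # One line per file, saying what it holds so others can interpret it.
    for name, meaning in files.items():
        lines.append(f"- {name}: {meaning}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical dataset, for demonstration only.
    print(
        build_readme(
            title="Example measurements",
            creator="A. Researcher",
            licence="CC-BY-4.0",
            description="How and when the data were collected.",
            files={"measurements.csv": "one row per sample"},
        )
    )
```

Keeping this as a plain text file next to the data means anyone can open it without special software.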
And also, there are things such as data dictionaries or
codebooks, which are a little bit more elaborate in comparison to the
README file. But it's also very good practice to document your
data so that others can actually reuse it. And then what is also
the same for data as for code: in order for it to be reusable,
you need a licence. And for data the most commonly used licences
are the Creative Commons licences. So we do use different
licences than software. Because data and software are different
objects, it functions a little bit different. So I would not
recommend using the Creative Commons licence for software.
But for data, this is standard practice. And the licence chooser
for software was already linked before this. I'm actually
linking to the same one because it's a very nice website. But
for data, we now also have a licence chooser. So you can go to
the link in the slides to check that out. So just briefly, to also
introduce what a licence is: it is basically a sort of
standardised contract that tells anyone what they can do
with your data. So they don't have to ask you what you allow
them to do with it. And that's actually why it's very important
to have this licence out there in the open. Because if your
data or your software doesn't have a licence, it means no one
can actually use it without asking you. So this is why it's
very important to have that licence. I'm getting distracted by
the chat. I'm not looking at that right now. Sorry. We'll get
to it later, hopefully. Right. Open data has CC0 or CC BY
licences. This means that there are not a lot of restrictions
for reuse of your data. So for example, CC0 or the public
domain allows the re-user to do basically anything with your
data. And they don't even have to attribute you. So they
don't necessarily have to cite you. As good practice, though, I would
recommend citing your sources. But if you would like to enforce
that a bit more, you could use the CC BY licence. So that
allows the same things as you can see in the slides as the
public domain. But it also ensures that every user
should cite you or credit you. And then the further down you go
with these licences, you see that they get a little bit more
restrictive. So for example, the CC BY non-commercial
no-derivatives licence, the one at the very bottom of this list, that one
does allow you to copy and publish the data, but it doesn't
allow for commercial use, and you can't actually modify and
adapt it. So this is not really open data in a sense because
there are a lot of restrictions that are placed on the data. But
in any case, if you want to choose that licence that is, of
course, up to you. And also just a highlight here. If you're not
sure what licence to choose for your project and you want to
share data, please also feel free to contact me if you just
want to ask or just have a chat about it. And then something which I find very
important is that it's not just about the access to the data and
redistributing the data. So I pasted two of my favourite
quotes regarding Open Science on the slides, which is that there
is no Open Science if science is not open to all. And that's a
quote from the blog post "bropenscience is broken
science", and I can recommend reading that one. And also the
quote: including more ways of knowing and understanding our
common world within the great scientific conversation would
enrich and diversify its collective ideas and creativity,
for the common good. And that quote, is from Open Science
beyond open access, and that's a report, which I also highly
recommend that you read. So it's really not just about having
access to things. I think open data is really about allowing
anyone to participate in the creation and generation of that
data as well. Because otherwise, I don't think it's truly open.
And if you want to learn more about that, there's the Data
Equity Framework by We All Count. And there is a webinar on
data justice talk story, which I recommend, and the book Data
Feminism by Catherine D'Ignazio and Lauren Klein. So
they go a little bit more into this topic on why it's important
that everyone can participate. And then I also listed some
additional resources. Paula Andrea Martinez gave a similar
presentation to this one last year for cohort two, and it's a
really good presentation. So do have a look. And if you want to
really participate in open data, and you want to get more
involved, I can recommend joining the Research Data
Alliance, which is a global research community that really
wants to increase awareness and facilitate sharing of data. So
this is a great way to get in touch with colleagues, and get
involved in initiatives and working groups and do some
really practical stuff as well in terms of data sharing. Of
course, The Turing Way has already been highlighted. But it has an
open research or reproducible research chapter. So you can
also find more information there. And there is the Open
Data Handbook by the Open Knowledge Foundation. And then
if you're running out of time, and you don't want to read
full data books and presentations and all these
things, I listed two blogs: "10 arguments against Open Science
that you can win" by Malvika. And I'm not necessarily saying that
you need to be an open data evangelist. But if you're
hesitant about sharing your data, this is also a good blog
to read, and maybe reconsider why you're hesitant about this.
And I also wrote a short blog about how can you make research
data accessible? So I'm shamelessly plugging that. And
this blog is a bit more practical in terms of how you
can actually make your data open. So we can't really go into
that right now because of time restrictions. But if you have
any questions regarding that, please let me know. Yeah, I'm
in your Slack channel, I'm one of the mentors. So I think
that's the easiest way to find me. But I'm also on Twitter, or
you can send me an email. And I think you might be able to
tag me on your GitHub repos, not 100% sure, but happy to
answer any questions there as well. Then that was it. Thank
you. I know you have to run soon. So
I'll keep it short. Folks, please leave your questions in
the Google Doc if we don't have time now. Shall we have one
quick one? Alexander asked: Do you have any experience with the
Open Data Commons licences? Oh, Open Data Commons licences.
Yeah, let me look at the Google Doc. Sorry. Do you mean the
same thing as the Creative Commons licence or is this a
separate one? Let me check the website. It's from the Open Life Sci. Yep, sorry. Unknown: Not me. They
are very, very similar, but not the same. These licences are not
for data. They are not very useful. But I think they are
more focused on data, on open data. Emmy Tsang: To be honest,
I haven't heard of them. So I'm happy to check that out later.
But yeah, there are multiple licences and not just the Creative
Commons licence. So yeah. Thanks for sharing. Thank you. We learn, we learn
new things every single time we have this, which is great. I'm
sorry, folks, I think Esther so you don't, I'm assuming you have
to run. I can do five more minutes. So
unfortunately, I have to go to another workshop. So I can't be
in the breakout rooms. But again, please do contact me on
Slack if you want to have like a second pair of eyes going over
your data or your licence or whatever, if you have more
questions, or if you're watching the recording. Okay, let's do it. Let's do as
many of these as we can before you have to run. I understand that
FAIR isn't enough to make data open. But isn't it a requirement
that open data are FAIR? Preferably, yes. Because the
FAIR principles really do, yeah, increase reuse, and they're not
really inhibiting each other. So
preferably, yes, but the other way around, then yeah, it doesn't
work so much. That answers the question. Thank
you. And let us ask, where's the best place to store your data in
the public databases to make it findable? Yeah. So there's a bit
of debate on what is, of course, the best place. So if you have a
discipline-specific repository, I would recommend
you to deposit your data there, because that's the place where
your colleagues are going to look for the data. So that's the
best place to do it. If there's no discipline-specific
repository, which is very likely the case because not
every discipline, or every subfield has its own repository,
then you can use a general repository, and one of the
examples is Zenodo, but also Figshare. And there's, yeah,
there's a couple of others. In that case, I would recommend
using a repository that really assigns these persistent
identifiers, because that makes your data more discoverable. And
also, it allows others to cite your data. And it's persistently
available. So those are, I think, the two main points to
pay attention to when you're looking for a repository. And I
can share some resources in the notes about how you find a
repository. I think it's also in the blog, not 100% sure, but
I'll share it. Thank you so much. Alright,
folks, if you have any more questions with that, thank you
so much. And since then, hopefully, we now have breakout
rooms. So going back a little bit to the first talk that we
had today on iterative project management and design. So the
breakout room will be 10 minutes. And just a small
reminder, before we start, if you haven't already done so
please, we really appreciate it if you can rename yourself with
a W or an S before your name. So we know where to put you Malvika Sharan: can actually Emmy Tsang: 10 minutes, there
are sort of like a sort of a two part task. So the idea is that
we would like you to break down your first milestone into
achievable chunks. So that's what you're going to do for the
first five minutes. Working silently on your own. You can
use the Google Doc to take notes as we do this. So that's from
page 10 onwards, you can see the status of your notes. So after
five minutes of breaking down your first milestone, share it
with your group, what you find interesting, and or challenging.
So if you are sharing in a spoken discussion, then you can
of course, speak freely amongst yourself. If you're sharing in a
written discussion, instead, we'd like to ask you
to please
keep your conversations written. This could be done either in the
Zoom chat, nobody else would be able to see it other than the
people in the room. You can use the Zoom chat or the Google
Doc, where everyone else will be able to see it. I hope
that's clear. And I think you're ready. Malvika Sharan: Yeah. So the
spoken rooms are room one, two, and three and rest of you all
are in the written room and I'm opening the room here. It'll Emmy Tsang: pop up on the screen
now. I think we're
all. So hope you
all had a good discussion and some interesting experience
breaking down your first milestone into achievable
chunks. Does anyone want to sort of verbalise some of their
thoughts on what you found interesting or challenging in
the process? Please feel free to unmute yourself. Yeah, go ahead. Unknown: Good morning, afternoon,
evening. Um, I think one of the challenges which I had on my
first milestone is really learning what I did not know. So
I think going into it, oh, GitHub, I can do this business.
And then it opened up like 20 other doors of more questions
and excitement and being like, oh, there's a lot more parts,
which I really need to fill in to make this a complete project.
So I think the deeper I get into these milestones, the more I
realised how much I could actually flesh them out. And
that's tough, but rewarding, I guess, more work. Emmy Tsang: Thank you so much.
Yeah, no, I feel you. Does anyone have any tips to share as
to how to make this a bit more manageable? Or is there anything
that you find challenging or interesting as well, in that
process of breaking things down? I'm checking the notes as well.
And I see that folks have mentioned, you know, it's a bit
of a chicken and egg problem that you have, you first have to
have an idea. But then, without the user's needs, you can't
really have the idea. So it's also a bit of that
iterative process of going back and forth between the problem
and the solution and the problem. And so, yeah, it's
great that you notice that, that's really great to
see. I hope you find this process rewarding nonetheless,
despite the fact that it could feel like a little bit of a
circle at some times. But I assure you, you're moving
forward, in many ways that you're not aware of
at this present time. But yeah. So if you have some time, yep. Unknown: I would like to make a
comment on that problem of, like, learning all these new
techniques. It might be a great way of onboarding new
people to your project, right? Maybe you know somebody who's
an expert in your department that already knows GitHub, and
what seems to be like a big task for you is like a five minute
job for them. So you make this a very low entry barrier task for
them. And you get them hooked into a project. Maybe. Emmy Tsang: That's a wonderful
tip. Thank you, Andre. Yeah, it's also very satisfying to be
able to help someone so do pass this wonderful feeling on to
other people. Anyone else would like to say something about
their experience breaking things down? Not afraid of the
awkward silence. So let's do that for another 10 seconds. Okay, if not, we can. We should
go to the next section. But please keep thinking about this.
Please keep trying, please keep sharing your experience. I now
pass to Malvika. Malvika Sharan: Great, thanks so
much Emmy. And I'm very glad that Andre already spoke. So it
makes it easy for me to introduce Andre. Andre is part of
an open hardware related mentoring programme, which is
very similar to OLS, and we generally call them our sister
programme. He's a co-founder of that and he's here to share with
you what open hardware means. Over to you, Andre. Unknown: Hi, everyone. Thanks
for the introduction, Malvika. Let me just find my, should have done
that before, find my slides, which I have actually shared on
the document already. Yeah. So Hello, everybody, thanks for the
space and time to talk about open hardware, which is
something that I'm really passionate about. And I
think we start by taking things one step back, because we spoke
about all of these wonderful initiatives in science,
right, we spoke about open data, and open source
software, which is more known to everybody. Where's my screen?
Here, I hope? Can everybody see my presentation? Okay, cool. And
so the point that I'm trying to make with this step back
is that without hardware, without equipment, there is no science
in a way, right? Like we can download all these amazing data
sets. But if you really want to be in control of generating
data, and answering exciting questions locally, and
questions that are important for your science in your community,
let's say, then you need the proper tools to do it, right.
Malvika already introduced me, but hello from my side, so that
I get started. I work at the University of Sussex, where I
have a job that I really, really like, which is
developing open source hardware for the Department of
neurosciences. I'm also volunteering a lot of my time to
TReND in Africa, which is an NGO trying to help development
in higher education in Africa. So we do a lot of sharing of our
knowledge so that people can actually build their own tools.
We do a lot of workshops. I started a small website called
Open Neuroscience, and I'm working with Julieta Arancio
and Alex Kutschera, as I mentioned, on this programme
that is similar to Open Life Science but for hardware, called
Open Hardware Makers. All of this started, I mean, I
think I already spoke about, like, why open hardware, right, like
we need hardware to do the experiments, the things that we
do. And here's just one example in life sciences. Right. So if
you are thinking about microscopes, which are one of the
workhorses inside life sciences, right? These tools
were first developed, like, in the late 1600s, I think, if I'm
not mistaken, and this one that you see on the left is actually
from the turn of the 20th century. So 1904, or something,
and the one on the right is a little bit more recent, as you
can see, like the modern microscope didn't change much
over the last 100 years. And still, if you want to get one on
the right, you would spend something like 10,000 pounds,
right? And I mentioned costs, because although we really like
to romanticise things about science in the end, like we
still need to pay for things. And so these things are really
expensive, although the technology
in them didn't change
much over the last 100 years, right. So there is something
wrong there from my perspective. But even if you don't consider
that, if you think about it, if you are, let's say, in the global south,
where I'm originally from, trying to buy a microscope, you
only find distributors from the global north. So you have to
import things, and they mostly have
their clients in the global north in mind. So if I bring it
back to my hometown in Brazil, where it is super hot, warm and
humid, then this thing is going to, like, rust really fast and
probably be out of work. If it breaks, I have no local support,
I have no idea what's going on inside because it's a
black box. And as COVID-19 has shown us, our supply chains for
equipment, for consumables and all, they are really
fragile, right? Like, once there is a disruption from, like,
China and so on, everybody gets stalled, and things get
delayed, and so on. And so all of this together makes
scientific equipment really slow to innovate, right? Because you
cannot open it. Like, I'm not gonna fiddle around with a 10,000
pound piece of equipment, right? Like, if I break it, the
whole department is gonna want my head. So, like, I better not
fiddle around with that, right? And so this is why, like, without
access to these tools, proper access, there is no science,
right? And this is where open source comes in. These are just a few examples of
open source microscopes that are available today. If
you want, you can find these projects online and
reproduce them right away. And the nice thing about them is
that all of them are under $100. Right? Some maybe... actually,
let's say all of them are under $200, which is like orders of
magnitude cheaper than the ones that I just showed you. And
they're all portable. Right, so here on the left, you have the
scale view, which is about five centimetres. It is something we
developed in the lab a long time ago, and we're using it for
educational purposes and so on. But you could do like 80% of the
stuff that you do in the labs, right? This one next to it is
actually very specialised for fluorescence, which is like a
fancy method, right, like in life sciences. But the ones that
I would like to highlight are these two. One is
the yellow one, which is called OpenFlexure, which is an
amazing project that I really like. I'm not involved with the
project myself, but I really like the project, because
they actually published papers and proved that this is
able to detect malaria inside blood cells. So the resolution
of this is on par with optical microscopes. And you can build
it for a fraction of the cost and 3D print parts wherever you are.
Actually, a lot of the developers of the design are in
Tanzania and are partner developers with the people here in Bath,
which is close by to Brighton. And the other microscope on
the right is actually about this big, and goes on the head of
mice, and they can actually do live imaging of brain activity
while the animals are performing a certain task. And what I think
is really crucial about these two examples, these last two,
is that for the yellow one, OpenFlexure, they actually
showed that they can do the same or better
performance than actual commercially available
microscopes. And the one on the right,
which is the Miniscope project, can actually provide something
that wasn't available before. So before the Miniscope, as an open
project came along, there was no way to have freely moving mice
and record their like brain activity with optical imaging.
Right. And so this brings a lot of innovation, there is a big
community around miniscope, just to like, pound on the point that
the speakers that came before me said that fostering a community
and bringing people into an open project actually is good for
research, is good for you, and it's good for innovation and,
like, speeding up the cycles in science and research. So I
wanted to keep this kind of short. So this is just a slide
that Juli made and I adapted to show why we should use open
hardware. And I hope I have mentioned this already, even if
like too fast and giving you a lot of information. But
let's take it again, step by step. So, because it's
reproducible. We actually use GitHub and GitLab a lot to put
out our projects, our open hardware projects, because it's
actually super fun to learn how somebody did something, like how
the Miniscope, this amazing project, works, right? I can go
into the documentation, and actually see how they put the
boards together, how the electronics work and so on. It's
super affordable. So from our experience with people that are
being part of our workshops in Africa, they say, look, this is
affordable enough that here in our institution, we can, like
put some resources together and get started with open source
hardware. It's repairable: because I know, like, everything
that is going on inside the hardware, I can also know how to
repair it when something goes wrong. But not even that,
because I know how everything works. I also know if the data
that I get out of it is reliable or not. Right, like, Oh, is this
PCR? Or is this image from this microscope really what it's
supposed to be? Or is it an artefact? Right, because I know
how it works, then I know the limitations and the
capabilities. And because I know the limitations and
capabilities. I know where and how, and if I need to customise
something. Right. So let's say I'm using the OpenFlexure for
something, but right now it works on power from the
wall socket. Right, but I need it with batteries. Because I know,
like, how it works, I can easily say, okay, if I change this
power supply for these batteries, then I can actually
bring this to the field and use it outside the lab space or even
in a lab where there are constant power failures. Right.
And putting all of this together, this is then
obviously, hopefully, if it's done right, democratising.
Right, because then it doesn't matter if you are in the global
north, or if you're doing citizen science, or if you are
inside academia, because you're collecting data with tools that
you actually know exactly what they're doing. You can go and
say, Look, this is my data. This was how it was recorded. And
it's good data, right? So we need to discuss
data and not whether or not, like, I have a $10,000
microscope. Andre: Luckily for us, there is
what I'm calling, like, the Cambrian explosion of open
source hardware going on. So what you're seeing here are a
lot of projects that are currently available. some
highlights that I think are really cool, like here on the
top left, you can see this little object on the on the palm
of her hand. This is an atomic force microscope, right? So this
goes down to image like really, really, really tiny stuff.
Right? So it really measures like nanometers. And things that
are really small. Then here, on the bottom row, you can see
the image where there is somebody holding a little
whiteboard. This is actually an ECG machine. And this person,
like, he wrote a blog post, and I can
find the reference later if somebody wants, but he actually
found that he has a heart arrhythmia, because he was able
to play around with this ECG and brought it to his doctor and
said, Look, this is my data. This is what my heart looks like
over a 24 hour period. And there is something wrong
here.
Let's fix it right. And it could have been that he would never
like know about this. But I think it's interesting that
people have the empowerment to like know what's going on with
them. But on this note, there is also a project that is called
Open insulin, where people are actually making easily monitors
and trying to make their own insulin. Because like price
surges of insulin in the us right now, for instance, a
crazy, and a lot of people are really having bad times managing
to get the
ir hands on the proper insulin that they need, and so
on. And so this also empowers people to say, you know what,
like, we are not going to take this nonsense anymore. And we're
going to do it our own way. And there are many, many other
projects, right. Just a little bit of data on something that
we're working on at the moment, just to show you. What we're seeing here
is the fraction of papers, as a percentage of
total publications from PubMed, over time, from the 1990s to
2020, that had either open source hardware, open
hardware, or open labware in their abstract, title, or
keywords. Right, so you can see that this is now growing. I
mean, it's still a tiny, tiny fraction of the total number of
papers, of course, but what I like here is how fast new
papers are coming, more and more each year. If people are
interested in this, I would really recommend taking a look
at the GOSH community, which is the Global Open Science Hardware
community, which is mainly where Juli, Alex and I met. But the
point that I like to make here is that this is a really, really
global online community most of the time, where people really
from all continents are discussing open hardware,
they're really, really open to newcomers and really willing to
help with questions. And there is, I think, the most important
document, which was done collectively. As you can see in
the photo on the bottom right, this was the voting for
things in the manifesto that we wrote, like, what is the
point of the Global Open Science Hardware community and what is
our manifesto, right. And you can see the points highlighted
here. And this tries to make sure that this still
keeps going as a really diverse and horizontal community when
things are being discussed, and all of these different
communities are being taken into consideration. Again, I can
share the link for this. I have a slide
at the end with useful links, this might be there. A little
bit of a shameless advert, I'm sorry. So we are actually
finishing the curriculum for a new programme. And you can
pre-register if you go to openhardware.space, and you can
also find more information about this. And the idea of this
programme is to take people that are newcomers into open hardware
and to show them best practices, right. So we're going to cover
a lot of things that are quite similar, in terms of, like, oh,
this is GitHub or this is Instructables, and these are all
these platforms. But also we're gonna show, like, points on
documentation and licences and things, because, believe
it or not, most hardware projects need at least three
different licences, which is different from all the projects
that we've been discussing right now. But yeah, so I just wanted
to say, please take a look at the website, get in touch if
you want. We'll be super, super happy to get projects, or
even just questions from you. Because we are at the moment
where we're finalising our curriculum and we're ready to
launch hopefully in another couple of months, and then still
run a cohort this year. With that, I'd like to say thanks.
You can ask me questions. I think you're writing, hopefully,
in the document. You can find me on Twitter, you can send me an
email as well, I'll be super happy to chat. Unknown: And just to say that
here is a list of what I think are useful links. And the
presentation is available also in the document. I hope I didn't
go over time. But this is mostly what I had to share today. Malvika Sharan: Thank you so
much, Andre. We have one question, but I'm just gonna
announce that we're at the top of the hour. If you have to drop off,
please feel free to drop off. But there is one single
assignment which is new, which is about breaking down your
milestones into smaller milestones. Besides that,
everything is a repetition from last cohort calls, where we asked
you to develop your documentation and try to launch
your website whenever you can. So back to
Andre, for folks who are still sticking around. So first of
all, there's a beautiful comment, which says, I
love how open hardware can really unleash everyone's
creativity. And every time this talk happens in our call, people
see it for the first time. And really, you know, it is
something that we don't talk about enough. So I'm really glad
that you could come and talk about it. We have one question
which says, I'm blown away by the mini microscope, can I ask
if you came across something similar for other hardware in
the lab, such as plate readers, or thermal cyclers for PCR? Unknown: Absolutely. So one
thing that I started doing when I got back into this path a long
time ago is, if you Google "open source" and then the
piece of equipment that you need, you most likely, and this
is really, really cool, you're going to find somebody who has
some sort of a project to do exactly what you wanted to do
already, right. And so for instance, there is a very famous
PCR machine named OpenPCR, and this inspired other projects. And
it's really much cheaper than regular PCR machines. What
I also wanted to say is that COVID has enlisted a lot of people
working on real time PCR machines that are open source
for detection, and testing, and so on. So there are a lot of
these projects now available online as well. What was the
other equipment? Thermal? Yeah. So all of these, I think they're
available. It's a matter of looking for this. And on the
links that I sent, there is the Open Hardware Observatory, where
you have a lot of projects that are there. There are actually,
sorry, I'm going on and on, but there are actually like $5 PCR
machines available, where you do like one PCR, like what would be
the equivalent of one Eppendorf tube at a time. But still, like
this is $5. Right? And so if you don't need to do like
96-well plates, like a regular one, this could be
enough for you.
Malvika Sharan: Right, so I'm going to
post the link that Andre just talked about; it is the project
directory where there are different protocols for hardware,
I believe, yeah. Okay. Thanks so much, Andre. That was
wonderful. And thank you everybody for being here.
Andre, you're not yet in the slide, but I'll add you and I
hope people can reach out to you if needed. Yep. Wonderful.
Thank you so much, everyone. Emmy and I are gonna stick
around for five more minutes. So I'm going to stop recording, but
if you have any comments