Frequently Asked Questions
I want to work with you, and I'm currently a student at the University of Colorado. How do I do that?
First, take natural language processing, machine learning, probabilistic models of human and machine intelligence, or data mining. Once you've done that, schedule a meeting, and we'll figure out a project to work on together. After we finish the project, we can discuss longer-term arrangements.
I want to work with you, and I'm not currently a student at Colorado. How do I work with you?
Then you should apply to be a student at Colorado. The best way would be to apply to the computer science program and mention me specifically in your application. After you submit an application, please drop me an e-mail (put GRADAPP-20XX in the subject) with your CV letting me know you applied. I may not reply, but it's still very useful!
To help me quickly search for such e-mails (and to show that you've done your homework by reading this FAQ), please put GRADAPP-20XX in the subject line, where XX is the year you hope to enroll.
See the openings page.
Are PhD students at Colorado funded?
The University of Colorado, like all top American universities, makes a commitment to fund PhD students so long as they're making adequate progress. This includes supporting tuition, a stipend, and health insurance, typically through a combination of research assistant and teaching assistant positions (my students typically TA once or twice). Colorado, unlike the coasts, also tends to be a bit cheaper, which means your stipend goes further.
Can I mention you in my statement of purpose?
You don't need to specifically ask my permission to list my name in a statement of purpose so long as our interests are a good match. If you think they are, go ahead!
Do you have any postdocs available?
I'm fairly junior, so I'm trying to fund students right now. I'm fairly good about keeping my webpage updated, so see the openings page.
Can I work with you as an intern?
Unfortunately, it's very hard to evaluate the quality of candidates without a formal system (e.g., as we have for university admissions). As a result, it is my policy only to work with people directly recommended to me by a professor or researcher with whom I already have a relationship.
Will I get admitted? Why was I rejected? What do you think of School/Program X?
I will not answer this sort of question. Don't even bother asking. I cannot give opinions on whether you will get accepted without seeing a full application. There are numerous venues where you can get uninformed opinions about your chances. In any given year, which students I accept depends on funding amounts, match between project and students, and who says yes or no and when. It's a very stochastic process, and I wish we had a more logical system.
Can I have a fee waiver?
This is a bit of a misnomer. A fee waiver does not mean the fee is waived; it means that I pay it for a prospective student. I'm happy to do so for a very strong student, but cannot do so for many students. If this is an issue, please write in an e-mail (with GRADAPP-20XX in the subject) that you request a fee waiver. If I reply explicitly saying I will pay your application fee, then you may check the fee waiver box on your application. Otherwise, I'm very sorry, but I cannot help that many applicants with their application fees.
I do provide fee waivers to strong candidates with a disadvantaged / nontraditional background. I (roughly) judge candidate strength from a CV, but the other criteria require a personal statement that explains your background.
If it were up to me, I would make grad school admissions free (or at least based on country/region GDP), but at Colorado it is the faculty that must pay for fee waivers, so I need to be selective.
You asked me to do a virtual interview after I applied for a PhD position. What does that mean and what should I do to prepare?
First, it means that you really stood out in the pool of applicants! I typically only interview five to ten applicants a year to select the candidates I will eventually invite to attend.
In many ways, it's a sanity check. If you say in your application that you're really good at X and you want to do Y, I'll ask about those things in a little more detail to better understand your background and your skills. I'll also ask about what you want out of a PhD program.
This really is a two-way conversation, however. We're going to work with each other for N years, and we both need to be sure that we can stand each other and work well together. So it's important for candidates to ask whatever questions they're concerned about too.
Can you give me a letter of reference?
I typically only write letters for students whose committee I've served on, whom I've worked on a research project with, or who did very well in my class. Unless you are my direct advisee, you must ask me before giving my name out.
When you ask, please send a bulleted list that answers the following questions: how we know each other (e.g., took class X, received grade Y, completed project on Z) and what research we have worked on (what the project was about, where it was published, your role in the project). Good rec letters contain details, and the more details you can provide that I can then surround with context, the better your letter will be.
For example, if I relied on my memory to write a letter of recommendation, I would be able to say something like "Susan took my class and did great, she did a project on music stuff". That's not as good as "Susan took my class Fall 2015, earned an A, and presented a final project on distinguishing musical styles automatically given the waveform of a song. Their group used a variety of techniques (support vector machines, convolutional neural nets, and k-nearest neighbors) to decrease the error rate of a strong baseline from 0.4 to 0.2". Obviously the second one is better, but I can't recall all of the details myself. Your bullet points will help me recall details and put your work into context.
What are your expectations / preferences in terms of what a student should know?
I personally like C++ and Python, but the culture here leans to Java, which I've been using more and more (and likely will continue to). I prefer writing tests to debugging, but debugging is a necessary evil. I do like reinventing the wheel somewhat to keep things self-contained and consistent, but I contribute the result to things like NLTK so that other people don't have to do the same. I also like using style checkers and the like to keep myself organized. (Though I say this, you can get a more honest picture of my coding style by looking at what I've actually written.)
Students who want to work with me should
- have basic knowledge of Python, C++, or Java (e.g. be able to write a dynamic program in that language),
- understand probability (Bayes rule, conditional probabilities, smoothing),
- compile LaTeX documents using BibTeX, and
- use version control software (e.g. git or svn).
These are the bare minimum requirements. If you do not meet these requirements, please take some classes to acquire these skills (preferably mine!) before asking to collaborate on research.
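For a rough sense of what "write a dynamic program" means at this level, here is a minimal sketch in Python. The choice of problem (Levenshtein edit distance) and the function name `edit_distance` are illustrative assumptions, not a specific exercise from this page.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance computed bottom-up, keeping only two table rows."""
    # prev[j] = cost of turning the first i-1 characters of source
    # into the first j characters of target (row 0: insert everything).
    prev = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        curr = [i]  # turning i characters into the empty string costs i deletions
        for j, t_char in enumerate(target, start=1):
            substitute = prev[j - 1] + (s_char != t_char)  # free if characters match
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

The table is filled row by row, so each cell only depends on cells that are already computed; the same pattern carries over to the C++ or Java version.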
You should already code in some language pretty well, and conforming to my coding style will increase the probability that I'll be more hands-on in helping you code and debug, but if you want to program in LISP or Prolog, that's perfectly fine too, as long as it works for you.
Being comfortable with probability is probably the more important requirement. You'll likely have to deal with messy probability distributions, take expectations, derive conditional distributions given a joint distribution, implement dynamic programming to sample from PCFGs, do Taylor approximations, do some optimizations, etc. This shouldn't be taken as a laundry list of things you should know (it's great if you do) but just as a heads up of the kinds of things you might run into; part of a graduate education (life, for that matter) is learning new stuff. There will be many opportunities to learn: from classes (at Colorado: NLP [Martin/Palmer]; Machine Learning [Me]; and Deep Learning [Mozer]), your peers, and reading group.
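As one small illustration of "derive conditional distributions given a joint distribution" and "take expectations", the sketch below works on a discrete joint stored as a dictionary. The function names and the toy numbers are assumptions made up for the example; real problems are rarely this tidy.

```python
from collections import defaultdict


def conditional_from_joint(joint):
    """Turn P(x, y), given as {(x, y): prob}, into P(x | y) as {y: {x: prob}}."""
    marginal_y = defaultdict(float)
    for (_, y), p in joint.items():
        marginal_y[y] += p  # P(y) = sum over x of P(x, y)
    conditional = defaultdict(dict)
    for (x, y), p in joint.items():
        if marginal_y[y] > 0:
            conditional[y][x] = p / marginal_y[y]  # P(x | y) = P(x, y) / P(y)
    return dict(conditional)


def expectation(dist, f=lambda value: value):
    """E[f(X)] for a discrete distribution given as {value: prob}."""
    return sum(p * f(value) for value, p in dist.items())


if __name__ == "__main__":
    # Hypothetical joint over a count x in {0, 1, 2} and a label y in {"a", "b"}.
    joint = {
        (0, "a"): 0.1, (1, "a"): 0.2, (2, "a"): 0.1,
        (0, "b"): 0.3, (1, "b"): 0.2, (2, "b"): 0.1,
    }
    cond = conditional_from_joint(joint)
    for label, dist in cond.items():
        print(label, dist, expectation(dist))
```

Dividing each joint entry by the marginal of the conditioning variable is all Bayes' rule asks for in the discrete case; the messy part in practice is doing the same with distributions that have no closed-form marginal.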
I think attending (and contributing to) a reading group or two is critical for learning about a field and being a good scholar. It's fun and not a chore at all, but I want to be up front in saying that any student of mine should be an active participant in a reading group or two. That means you don't just show up: you need to present papers and be involved in the discussion of every paper. If you didn't understand a paper, ask smart questions until you do. If you did understand a paper well, answer other people's questions.
Reading groups are also important for being able to "look smart" when you're interviewing. You'll need to be able to connect your work to what other people do. A reading group lets you know how your thesis connects to other research topics and talk intelligently about them. Unfortunately, this can't be done quickly; it requires dedication over many years to learn about the breadth of research that folks explore. So while you might feel like skipping reading group once is a good decision to get more work done, it's ultimately a bad decision because you need to consistently go to understand a broad range of topics.
How do you interact with students?
I like to have a group meeting every other week with students I'm working with (broadly construed), and one-on-one meetings as needed with students. I use Google calendar to set up my appointments, so students can grab a meeting whenever they need to. I expect students working with me full time to meet with me on average once a week (sometimes much more, such as before a paper deadline, and sometimes less). I use this online system so that my meetings are contiguous and that students always know when I'm available (and I can change things without e-mail). Students should sign up for a meeting at least 24 hours in advance. It's okay to schedule meetings outside of that time, but that should be the exception (I try to maximize the amount of contiguous time I have to research, write, and think).
In addition, everyone in my group (me included) sends a weekly e-mail to everybody saying:
- What they worked on that week
- What they plan to work on next week
- Anything that's holding them up or blocking their progress
Anyone who is working for me full time or who is my direct advisee must send me such an e-mail (with the subject [Weekly Snippet YYYY MM DD]) sometime on Friday Mountain time. I find that this is very helpful because I sometimes ask myself (or have funding agencies ask me) what I (and my students) did in a particular time period. These e-mails really help me figure that out without bugging other people. It also helps me stay productive by setting realistic goals; I use this weekly todo list to populate my daily todo list.
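If you like to script this sort of thing, the sketch below builds the subject line and a three-part body matching the list above. The helper names (`snippet_subject`, `snippet_body`) and the section headings are assumptions for illustration; only the subject format comes from this page.

```python
from datetime import date


def snippet_subject(day: date) -> str:
    """Subject line in the [Weekly Snippet YYYY MM DD] format."""
    return f"[Weekly Snippet {day:%Y %m %d}]"


def snippet_body(done, planned, blockers):
    """Plain-text body with one bulleted section per question."""
    sections = [
        ("Worked on this week", done),
        ("Plan for next week", planned),
        ("Blocked by", blockers),
    ]
    lines = []
    for title, items in sections:
        lines.append(f"{title}:")
        # An empty section is still listed so it is clear nothing was forgotten.
        lines.extend(f"- {item}" for item in (items or ["(nothing)"]))
        lines.append("")
    return "\n".join(lines).rstrip()


if __name__ == "__main__":
    print(snippet_subject(date.today()))
    print(snippet_body(["Drafted intro"], ["Run baseline experiments"], []))
```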
So what makes a good goal? You should have "Big Picture Goals" that carry over from week to week; these are often at the level of something you want to make happen this year or semester. Every week you should do something that brings you closer to achieving those big goals. Within a week, your goals should be SMART. Don't have vague goals like "write code" or "continue reading". It should be obvious whether you succeeded or not in your goal (specific and measurable), it should fit in with the big picture (relevant), and it should be doable in a week (time-bound and attainable).
Outside of that, I prefer face-to-face communication (when I'm not sitting down at my computer being productive) or e-mail as a communication mechanism. Instant messages are also sometimes okay for quick questions, but never send an e-mail and then ask via IM "Did you see my e-mail?"
I need you to do something (look over a draft, send an e-mail, etc.). How should I best make sure that happens?
The most important thing is to make sure it's on my radar. If you have an important deadline, make sure it appears in your snippet that you send me weekly. I will make sure I budget my time to ensure that it gets taken care of. Give me as much warning as possible. I get grumpy if I have to rearrange my schedule for you at the last minute.
It's fine (and helpful) for you to remind me. However, I'd like to make the following caveats. Unless the deadline is hours away, the best way is over e-mail; not phone or IM. It's less intrusive and I have systems for dealing with tasks that arrive over e-mail. The frequency of the reminder is also important. No more than once every five days, I would suggest.
Finally, make it as easy as possible for me to do what you need me to do. Have your reminder e-mail reference all of the material I need to do the task. If I'm reviewing a paper, remind me where in the repository it lives and send me a compiled PDF. If I need to write a letter, provide the background material and the contact information in one place.
How important are classes once I'm a PhD student?
One very frequent problem I see is that young first year PhD students want to do very well in their classes and think of research as a hobby.
For RAs, it is very much a job. Your professor has secured funding for PhD students to do research and to produce results. If you fail to produce, it makes the professor look bad to their funders, and the professor will not want to pay you to do research in the future (i.e., like a job, you can get fired).
Grades are not important whatsoever, so long as you're not getting kicked out of the program. You should use classes to become a better researcher, but if you're chasing after an A when a B would suffice and your research suffers, that's detrimental to yourself, your professor, and to science.
If you're not an RA (on fellowship or TA), then doing research is often a tryout for an RA. Unless you're 100% sure you'll have fellowship funding your entire time as a PhD student, you should make sure your professor would take you on as an RA in a heartbeat if needed.
How often should I be publishing?
You should always have an idea that you're actively working on for a paper. Publishing one to two papers a year is a good average (however, this does not mean that you'll always have a publication every year). Under normal circumstances, I expect students to have one publication at least submitted before their comprehensive exam, two by their proposal, and three by their defense (it's of course fine to have more, but don't prioritize quantity over quality).
I'm submitting a paper we talked about, can I add you as an author?
I should not be surprised by a paper. If I'm going to be an author, I want to (1) see a draft with the "big picture" at least two weeks before the deadline and (2) see a nearly complete draft at least a week before the deadline. (I reserve the right to still say no to papers even if you follow these rules, e.g., if I'm on vacation.)
For students working directly with me in my group, this is less of an issue: I know what's going on and can judge whether we can submit (a collaborative discussion). But for students who come to me to discuss an idea, disappear for two months, and then suddenly appear and want me to be a coauthor, this can be pretty annoying. My likely response is "no": I will not be a coauthor, and I will not contribute to the paper. If you wait until the last minute, the paper likely won't be any good, and I have other papers with authors who were responsible and played by the rules.
You can still choose to submit, but do not list me as an author.
Academia and Research
Is topic modeling dead? Should we all be doing deep learning?
Deep learning should be part of any modern researcher's toolkit. However, I do not think that this means that we should completely abandon topic models. Topic models are still very useful for use cases where interpretability is important. You'll still see many researchers in digital humanities using topic models, for instance, because they care about telling a good story and understanding their data.
As topic models become more of a utility, I think we'll see less of the "topic model of the week" that we saw 2005-2010. I think the important questions are how to incorporate topic models into real-world workflows and measuring whether topic models help users with those tasks.
I still hope to publish some topic modeling papers in this area.
I'm trying to use your code, but I'm having trouble. How should I get help?
E-mail all of the people who worked on the paper associated with the code with
- a minimal (simple as possible) example that can replicate your problem;
- the inputs that replicate your problem (again, this should be as simple as possible; sending multi-megabyte files is usually not minimal);
- exactly what you did (the exact command line used);
- what you expected to see;
- what you got instead (include error messages and any output); and
- what versions of various resources you're using (NLTK, Java, gcc, boost, protocol buffers, etc.).
This information is necessary for us to help you with your problem. The simpler it is to replicate your problem, the faster you will get a response. More complicated setups take longer for us to try out and debug. If your example is simple enough, we can often see the problem ourselves without running code.
Each e-mail should be self-contained. All the information to reproduce the bug should be in one place. This helps us quickly reproduce the bug, and it also ensures that you've not tweaked anything that might prevent us from isolating the issue.
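One way to keep such an e-mail self-contained is to generate the skeleton with a short script. The sketch below is an assumption about how that might look, not a tool that ships with any of the code: the function names are hypothetical, and it only reports versions of Python packages (Java, gcc, boost, and the like would have to be added by hand).

```python
import platform
from importlib import metadata


def package_version(name: str) -> str:
    """Installed version of a Python package, or a note that it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def bug_report(command, expected, actual, packages=("nltk",)):
    """Assemble the command, expectation, actual output, and versions into one text."""
    versions = [f"Python {platform.python_version()} on {platform.platform()}"]
    versions += [f"{name} {package_version(name)}" for name in packages]
    parts = [
        ("Exact command", command),
        ("Expected", expected),
        ("Got instead", actual),
        ("Versions", "\n".join(versions)),
    ]
    return "\n\n".join(f"{title}:\n{text}" for title, text in parts)


if __name__ == "__main__":
    # Hypothetical command and messages, purely for illustration.
    print(bug_report("python run.py --input tiny.txt",
                     "a list of topics",
                     "KeyError: 'vocab'"))
```

The minimal input file and the minimal script that triggers the problem still need to be attached separately.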
What's up with your name? Why is it hyphenated? What should I call you?
My parents' last names are Boyd and Graber. When I was born they hyphenated (why people whose nicknames were "Toni the Body" and "Little Grabber" would do so is beyond me; my nickname is obvious). As a result, I am deeply, personally, against hyphenating names. Don't do it. It's not a sustainable practice, and it leads to all sorts of problems. People think my last name is just "Boyd" or "Graber", web forms don't think I have a valid name, and there's only about a forty percent chance someone will get my name right after one telling.
Most people call me Jordan, which is just fine by me. I also answer to JBG.
I'm a TA or grader for one of your courses; what do I need to know?
- First, make sure that we have a meeting before the semester starts.
- Attend at least a class or two to get a feel for what's going on.
- As each assignment is posted, look it over to make sure I haven't done anything stupid (e.g., a confusing problem); it will make your life easier.
- Once assignments arrive, create an ontology of all of the mistakes that people have made (do this before you start "grading"); this will allow you to fairly and consistently deduct points.
- Using that ontology, create a template that you can use to provide feedback to students (e.g. by copy/paste or deleting). This allows you to explain each mistake in detail without having to retype the same thing over and over again. It also ensures that you give consistent feedback for each mistake people make.
- Post a synopsis of the mistakes that people made and how to correct them.
- Never give a grade without explaining why people got the grade they did.
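The ontology-plus-template idea above can be as simple as a lookup table. The sketch below is one possible shape for it; the mistake codes, point values, and feedback text are invented placeholders, and the names `Mistake`, `ONTOLOGY`, and `grade` are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mistake:
    deduction: float  # points taken off each time this mistake appears
    feedback: str     # explanation pasted into the student's feedback


# Hypothetical ontology, written down before any grading starts.
ONTOLOGY = {
    "no-smoothing": Mistake(2.0, "Unseen events get probability zero; add smoothing."),
    "off-by-one": Mistake(1.0, "The loop skips the last element of the input."),
}


def grade(max_points, mistake_codes, ontology=ONTOLOGY):
    """Return (score, feedback) so that every deduction comes with its reason."""
    score = float(max_points)
    lines = []
    for code in mistake_codes:
        mistake = ontology[code]  # an unknown code means the ontology needs updating
        score -= mistake.deduction
        lines.append(f"-{mistake.deduction:g} ({code}): {mistake.feedback}")
    if not lines:
        lines.append("No deductions.")
    return max(score, 0.0), "\n".join(lines)


if __name__ == "__main__":
    score, feedback = grade(10, ["no-smoothing", "off-by-one"])
    print(score)
    print(feedback)
```

Because the deduction and the explanation live in the same entry, two students who make the same mistake lose the same points and read the same feedback.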
Why did you leave the University of Maryland?
This was a very difficult decision. The biggest factor was to be near family (I was born in Colorado, and that's where most of my family is).
I'm very glad that I had the opportunity to be a part of Maryland's research environment, and nothing about my move should be read as a commentary on the University of Maryland (whom I adore and continue to collaborate with).
What's your Erdős number?