My Blog Posts, in Reverse Chronological Order
This past semester, I took Convex Optimization and Approximation (EE 227C). The name of the course is slightly misleading, because it’s not clear why there should be the extra “and approximation” text in the course title. Furthermore, EE 227C is not really a continuation of EE 227B[1], since 227B material is not a prerequisite. Those two classes are generally orthogonal, and I would almost recommend taking them in the reverse order (EE 227C, then EE 227B) if one of the midterm questions hadn’t depended on EE 227B material. More on that later.
Here’s the course website. The professor was Ben Recht, who, amusingly enough, calls the course by a different name: “Optimization for Modern Data Analysis”. That’s probably more accurate than “Convex Optimization and Approximation”, if only because “Convex Optimization” implies that researchers and practitioners are dealing with convex functions. With neural network optimization being the go-to method for machine learning today, however, the loss functions in practice are often non-convex. EE 227C takes a broader view than just neural network optimization, of course, and this is reflected in the main focus of the course: descent algorithms.
Given a convex function $f : \mathbb{R}^n \to \mathbb{R}$, how can we find the $x$ that minimizes it? The first thing one should think of is the gradient descent method: $x_{k+1} = x_k - \alpha \nabla f(x_k)$, where $\alpha$ is the step size. This is the most basic of all the descent methods, and there are tons of variations of it, as well as similar algorithms and/or problem frameworks that use gradient methods. More generally, the idea behind descent methods is to iteratively update our “point of interest”, $x$, with respect to some function, and stop once we feel close enough to the optimal point. Perhaps the “approximation” part of the course title is because we can’t usually get to the optimal point of our problem. On the other hand, in many practical cases, it’s not clear that we do want to get the absolute optimal point. In the real world, $x$ is usually a parameter of a machine learning model (often written as $\theta$) and the function to minimize is a loss function, showing how “bad” our current model is on a given set of training data. Minimizing the loss function perfectly usually leads to overfitting to the training data, which hurts performance on the test data.
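To make the update rule concrete, here is a minimal sketch of plain gradient descent in pure Python. It is not taken from the course notes: the function name `gradient_descent`, the stopping tolerance, and the toy quadratic are all my own assumptions, chosen only because the minimizer is easy to check by hand.

```python
def gradient_descent(grad, x0, step_size=0.1, tol=1e-8, max_iters=10_000):
    """Iterate x_{k+1} = x_k - step_size * grad(x_k) until the gradient is small."""
    x = list(x0)
    for _ in range(max_iters):
        g = grad(x)
        # Stop once the gradient norm says we are "close enough" to optimal.
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical test problem: f(x) = (x0 - 3)^2 + 2 * (x1 + 1)^2,
# a convex quadratic whose minimizer is (3, -1).
def grad_f(x):
    return [2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)]


x_star = gradient_descent(grad_f, [0.0, 0.0])
print(x_star)
```

The fixed step size of 0.1 works here only because the quadratic is well conditioned; picking the step size automatically is what line search is for.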
Here are some of the most important concepts covered in class that reflect the enormous breadth of descent methods, listed roughly in order of when we discussed them:
Line search. Use these for tuning the step size of the gradient method. There are two main ones to know: exact (but impractical) and backtracking (“stupid,” according to Stephen Boyd, but practical).
Momentum and accelerated gradients. These add in extra terms in the gradient update to preserve “momentum”, the intuition being that if we go in a direction, we’ll want to “keep the momentum going” rather than throwing away information from previous iterations, as is the case with the standard gradient method. The most well-known of these is Nesterov’s method: $x_{k+1} = x_k + \beta (x_k - x_{k-1}) - \alpha \nabla f(x_k + \beta (x_k - x_{k-1}))$, where $\beta$ is the momentum parameter.
Stochastic gradients. These are when we use approximations of the gradient that match it in expectation. Usually, we deal with them when our loss function is of the form $f(x) = \sum_{i=1}^n f_i(x)$, where each $f_i$ is the loss on a specific training data example. The gradient of $f$ is the sum of the gradients of the individual terms, but we can use a random subset each iteration, and performance is often just as good while each iteration is much, much faster.
Projected gradient. Use these for constrained optimization problems, where we want to find a “good” point $x$, but we have to make sure it satisfies the constraint $x \in \Omega$ for some space $\Omega$. The easiest case is when we have component-wise linear constraints of the form $a_i \le x_i \le b_i$. Those are easy because the projection is as follows: if $x_i$ exceeds the range, either decrease it to $b_i$, or increase it to $a_i$, depending on which case applies.
Subgradient method. This is like the gradient method, except this time we use a subgradient rather than a gradient. A negative subgradient is not necessarily a descent direction, so perhaps this shouldn’t be in the list. Nonetheless, the performance in practice can still be good and, theoretically, it’s not much worse than regular stochastic gradient.
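Proximal point. To me, these are non-intuitive. These methods combine a gradient step with a proximal step. The proximal step generalizes projection: when the non-smooth term is the indicator function of a constraint set, the proximal operator reduces to a projection onto that set.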
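A few of the variants in this list are short enough to sketch directly. The code below is my own illustration, not the course’s reference code: the function names, the Armijo constant `c`, the fixed momentum value, and the toy quadratic are all assumptions. It shows a backtracking line search step, Nesterov’s update with a constant momentum parameter, and projected gradient for box constraints.

```python
def backtracking_step(f, grad, x, alpha=1.0, shrink=0.5, c=1e-4):
    """One gradient step; shrink alpha until the Armijo sufficient-decrease test passes."""
    g = grad(x)
    g_sq = sum(gi * gi for gi in g)
    while True:
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        if f(x_new) <= f(x) - c * alpha * g_sq:
            return x_new
        alpha *= shrink


def nesterov(grad, x0, step_size, momentum, iters):
    """x_{k+1} = y_k - step_size * grad(y_k), with y_k = x_k + momentum * (x_k - x_{k-1})."""
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        # Look-ahead point that carries the previous displacement forward.
        y = [xi + momentum * (xi - xp) for xi, xp in zip(x, x_prev)]
        g = grad(y)
        x_prev, x = x, [yi - step_size * gi for yi, gi in zip(y, g)]
    return x


def project_box(x, lower, upper):
    """Clip each coordinate into [lower_i, upper_i]."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


def projected_gradient(grad, x0, lower, upper, step_size, iters):
    """Take a gradient step, then project back onto the box."""
    x = project_box(x0, lower, upper)
    for _ in range(iters):
        g = grad(x)
        x = project_box([xi - step_size * gi for xi, gi in zip(x, g)], lower, upper)
    return x


# Hypothetical test problem: f(x) = (x0 - 3)^2 + 2 * (x1 + 1)^2, minimized at (3, -1).
def f(x):
    return (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2


def grad_f(x):
    return [2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)]


x_bt = [0.0, 0.0]
for _ in range(50):
    x_bt = backtracking_step(f, grad_f, x_bt)

x_nes = nesterov(grad_f, [0.0, 0.0], step_size=0.1, momentum=0.5, iters=200)

# With the box [0, 2] x [0, 2], the constrained minimizer is the corner (2, 0).
x_box = projected_gradient(grad_f, [1.0, 1.0], [0.0, 0.0], [2.0, 2.0], step_size=0.1, iters=200)
```

The unconstrained runs should both land near (3, -1), while the projected run should stop at the corner (2, 0), since the objective is separable and each coordinate gets clipped on its own.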
Then later, we had special topics, such as Newton’s method and zero-order methods (a.k.a. finite differences). For the former, quadratic convergence is nice, but the method is almost useless in practice. For the latter, we can use it, but avoid it if possible.
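For the two special topics, here is a small sketch under my own assumptions (the function names, the difference width `h`, and the test functions are hypothetical and not from the lecture notes). It shows Newton’s method in one dimension and a central finite-difference estimate of the gradient that uses only function values.

```python
import math


def newton_1d(df, d2f, x0, iters=20):
    """Newton's method for minimizing a smooth 1-D function: x <- x - f'(x) / f''(x)."""
    x = x0
    for _ in range(iters):
        x = x - df(x) / d2f(x)
    return x


def finite_difference_grad(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad


# Hypothetical 1-D problem: f(x) = exp(x) - 2x is convex with minimizer x = ln(2).
x_newton = newton_1d(lambda x: math.exp(x) - 2.0, lambda x: math.exp(x), x0=0.0)


# Hypothetical 2-D problem: the true gradient of q at (0, 0) is (-6, 4).
def q(x):
    return (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2


g_fd = finite_difference_grad(q, [0.0, 0.0])
```

Note the cost difference: the finite-difference gradient needs two function evaluations per coordinate, which is one reason to avoid it when a real gradient is available.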
As mentioned earlier, Ben Recht was the professor for the class, and this is the second class he’s taught for me (the first being CS 281A) so by now I know his style well. I generally had an easier time with this course than CS 281A, and one reason was that we had typed-up lecture notes released beforehand, and I could read them in great detail. Each lecture’s material was contained in a 5-10 page handout with the main ideas and math details, though in class we didn’t have time to cover most proofs. The notes had a substantial number of typos (which is understandable) so Ben offered extra credit for those who could catch typos. Since “catching typos” is one of my areas of specialty (along with “reading lecture notes before class”) I soon began highlighting and posting on Piazza all the typos I found, though perhaps I went overkill on that. Since I don’t post anonymously on Piazza, the other students in the class also probably thought it was overkill[2].
The class had four homework assignments, all of which were sufficiently challenging but certainly doable. I reached out to a handful of other students in the class to work together, which helped. A fair warning: the homeworks also contain typos, so be sure to check Piazza. One of the students in class told me he didn’t know we had a Piazza until after the second homework assignment was due, and that assignment had a notable typo; the way it was originally written meant it was unsolvable.
Just to be clear: I’m not here to criticize Ben for the typos. I think it’s actually a good thing, because he has to start writing these lecture notes and assignments from scratch. This isn’t one of those courses that’s been taught every year for 20 years and where Ben can reuse the material. The homework problems are also brand new questions; one student who took EE 227C last spring showed me his assignments which were vastly different.
In addition to the homeworks, we had one midterm just before spring break. It was a 25.5-hour take-home midterm, but Ben said students should be able to finish the midterm in two hours. To state my opinion: while I agree that there are students in the class who can finish the midterm in less than two hours, I don’t think that’s the case for the majority of students. At least, it wasn’t for me (I needed about six hours), and I got a good score. The day we got our midterms back, Ben said that if we got less than an 80 on the midterm, we shouldn’t talk to him to “complain about our grades.”
Incidentally, the midterm had four questions. One question wasn’t even related to the material that much (it was about critical points) and another was about duality and Lagrange multipliers, so that probably gave people like me who took EE 227B an advantage (these concepts were not covered much in class). The other two questions were based more on stuff directly from lecture.
The other major work component of EE 227C was the usual final project for graduate-level EE and CS courses. I worked on “optimization for robot grasping”, which is one of my ongoing research projects, so that was nice. Ben expects students to have final projects that coincide with their research. We had a poster session rather than presentations, but I managed to survive it as well as I could.
My overall thought about the class difficulty is that EE 227C is slightly easier than EE 227B, slightly more challenging than CS 280 and CS 287, and around the same difficulty as CS 281A.
To collect some of my thoughts together, here are a few positive aspects of the course:
- The material is interesting both theoretically and practically. It is heavily related to machine learning and AI research.
- Homework assignments are solid and sufficiently challenging without going overboard.
- Lecture notes make it easy to review material before (and after!) class.
- The student body is a mix of EE, CS, STAT, and IEOR graduate students, so it’s possible to meet people from different departments.
Here are the possibly negative aspects of EE 227C:
- We had little grading transparency and feedback on assignments/midterms/projects, in part because of the relatively large class (around 50 students?) and only one GSI. But it’s a graduate-level course and my GPA almost doesn’t matter anymore so it was not a big deal to me.
- We started in Etcheverry Hall, but had to move to a bigger room in Donner Lab (uh … where is that?!?) when more students stayed in the class than expected. This move meant we had to sit in cramped, auditorium-style seats, and I had to constantly work to make sure my legs didn’t bump into whoever was sitting next to me. Am I the only one who runs into this issue?
- For some reason, we also ended class early a lot. The class was listed as being from 3:30-5:00PM, which means in Berkeley, it goes from 3:40-5:00PM. But we actually ran from 3:40-4:50PM, especially near the end of the semester. Super Berkeley time, maybe?
To end this review on a more personal note, convex optimization was one of those topics that I struggled with in undergrad. At Williams, there’s no course like this (or EE 227B … or even EE 227A!![3]) so when I was working on my undergraduate thesis, I never deeply understood all of the optimization material that I needed to know for my topic, which was about the properties of a specific probabilistic graphical model architecture. I spent much of my “learning” time on Wikipedia and reading other class websites. After two years in Berkeley, with courses such as CS 281A, CS 287, EE 227B, and of course, this one, I finally have formal optimization education, and my understanding of related material and research topics has vastly improved. At our last lecture, I asked Ben what to take after this. He mentioned that this was a terminal course, but the closest would be a Convex Analysis course, as taught in the math department. I checked, and Bernd Sturmfels’s Geometry of Convex Optimization class would probably be the closest, though it looks like that’s not going to be taught for a while, if at all. In the absence of a course like that, I’m probably going to shift gears and take classes in different topics, but optimization was great to learn. I honestly felt like I enjoyed this course more than any other in my time at Berkeley.
Thanks for a great class, Ben!
[1] For some reason, Convex Optimization is still called EE 227BT instead of EE 227B. Are Berkeley’s course naming rules really that bad that we can’t get rid of the “T” there?
[2] I’m not even sure if I got extra credit for those.
[3] One of the odd benefits of graduate school is that I can easily rebel against my liberal arts education.
On January 21, 2015, I saw an email in my inbox about an issue of Berkeley Engineering, which must be some magazine published by the university every few months. I wasn’t planning on reading it in detail, but one of the articles caught my eye. It was about a former Berkeley graduate student, Thibault Duchemin, who had just co-founded a company called Transcense (now named Ava) to break the communication barrier that plagues hearing impaired people when we attempt to talk with hearing people. Their main product is an app that can perform automatic speech recognition, so a hearing impaired person can look at his/her phone during a conversation and (hopefully) read the text to understand what’s going on.
Why did Thibault start the company? In part, it was because of his experience as a hearing person in a deaf family. (That’s rather unusual, since it’s typically the case that there’s a single deaf person in a hearing family[1].)
When I was reading this article, I kept thinking about the continued importance of automatic speech recognition. Today, it is widely used in practice (as any avid Googler can tell) and is also a popular research subfield in computer science. I wish I could do research in that area, but unfortunately, I don’t think people who do that kind of research would be interested in working with me.
Needless to say, I wanted to know more, so I sent Thibault an email, and was pleasantly surprised to get a fast response. We decided to meet in person at one of my favorite cafes, Nefeli’s Cafe, located on the edge of the Berkeley campus. We chatted for about an hour in sign language. I was probably a little rusty, and there may have been some French versus English signing confusion, but we understood what we were saying to each other.
I later met a few more people from Ava since I asked to stay in touch with them. Since my meeting with Thibault, they’ve made enough progress on their product that it’s currently in beta stage and released to a specific audience. I recently tested it out with one of their other co-founders, Pieter, and they’ve definitely made progress, though they need to iron out some of the bugs we found during my session. They only have about nine people working for them so hopefully they will be able to work hard to get the app to a useful stage. By the way, here’s the link to their new website.
One might wonder how their product works. I don’t know the details, but I think they use some of Google’s speech recognition software. It’s possible to design your own automatic speech recognition software (I did one for CS 288 using Hidden Markov Models) but it’s definitely far easier to use one that’s already existing, rather than build a huge one from scratch, which would require a ridiculous amount of data, and probably lots of neural network tweaking.
As I continue to fight daily doses of isolation, it’s nice to think in the back of my mind that there are people out there willing to work and help me.
[1] That doesn’t apply to me, however.
This morning, I went running through the perimeter of the Berkeley Marina area, including Cesar Chavez Park. This is only the second time I’ve done this route, but I can already tell that I’ll be coming back here every weekend.
I highlighted the running route in the following Google Maps image:
Google Maps said this was 5.3 miles, which matches what my iPhone reported me running this morning. I ran while holding my phone in my left hand and my keychain – with a lanyard – in my right.
In terms of footwear, I used my Vibram FiveFingers. I hadn’t used these shoes for more than a year, but I’m glad I had them. There’s something oddly appealing about running outdoors with minimalist style shoes. Fair warning, though: a day after my first time running this route, I had really sore calves! This is probably because running with Vibrams means we run “on our toes,” so we rely more on the calves.
Here’s how to start the route as I’ve highlighted it. The first thing to do is find a way to walk past the Amtrak. At 708 Addison Street, there is a walkway and road that intersects with the Amtrak track. Here’s an image of the railroad crossing:
Obviously, look both ways before you run across it! The Amtrak train actually appeared this morning as I was running towards it. Fortunately, the train isn’t that long (at least, the one I saw) so you won’t be waiting too much.
Once you run through that, you’ll find yourself in a small park. Look for the pedestrian bridge that goes over the highway. In the Google Maps image above, it’s where the route curves down and touches the box saying “11”. It’s a safe bridge that also has separate lanes for walkers and bikers. (A funny side note: during my first stab at this route, in typical Berkeley fashion, a few people wearing “Democratic Socialists” t-shirts were holding up several “Bernie!” banners so that drivers on the highway could see them.) Once you’re past the bridge, just continue running the route as I highlighted above. I went south first to Shorebird Park, and then north through Cesar Chavez Park, but you could easily run it in the opposite direction.
One reason why I like this route so much is that it’s really safe. Almost all of it is on a sidewalk or a bike lane that’s separate from the roads cars use. In addition, when I ran (on Saturday morning) there were a number of people there walking, jogging, or just hanging out, but it wasn’t super-crowded. I’m much more comfortable if I see a few other joggers there, since it means I know this is a place where people run.
On top of all that, there are some nice views of the bay. The upside of clutching one’s phone when running is the ability to stop and take pictures as desired. I took pictures when I was running around the perimeter of Cesar Chavez Park.
Here’s one I took that’s relatively close to a favorite restaurant of mine, Skates on the Bay:
A view of San Francisco:
And a panorama.
In all three, the Golden Gate Bridge is visible (you might have to squint at the panorama).
When I was running through Cesar Chavez Park, I noticed a few nice-looking hotels and restaurants. Perhaps these are worth checking out.
On the way back, you’ll see the Berkeley area across the water. Here’s another panorama:
Here’s part of the route on the north part of McLaughlin Eastshore State Park. This area has more dirt than the other parts of the route.
This route is definitely a keeper.
A few weeks ago, I attended the Spring 2016 BVLC retreat, which was a three-day event (Sunday, Monday, Tuesday) held in Sonoma, CA, in the Wine Country. There was a similar event last year, but I did not attend that one. BVLC stands for “Berkeley Vision and Learning Center,” but the organization recently re-branded itself as BAIR (“Berkeley Artificial Intelligence Research”). I’m a student member of the group. Check out the new BAIR website.
This post is split in two parts. The first will be a recap of my experience at the BVLC retreat. The second will explain why the BVLC retreat was nearly a disaster.
The BVLC Retreat
I took the BVLC-sponsored bus ride from Berkeley to Sonoma with other students, postdocs, and (a few) faculty. After going through typical sign-in procedures and checking into our rooms, the first major event was the bike ride (though I think only two of the faculty actually rode with us, and both are new assistant professors).
We were divided into three groups and, led by a few experienced bikers, rode across a park and a few roads to reach the lunch destination at the Bartholomew Park Winery. I was a tiny bit nervous about embarrassing myself since I hadn’t biked for a few years, but everything went well and I enjoyed the ride. I observed that other bikers were able to maintain conversations while biking; it’s hard for me to do that since I don’t want to lose focus on the bike path, so I didn’t do much talking along the way. Once we were at the winery, a humorous host instructed us on the finer points of wine-tasting and provided us with six different wines to drink. Among other things, I learned that to drink wine correctly, you need to spin/twirl your glass. Then we had a surprisingly delicious lunch outside. I sat next to two other students I knew, and while it was tough at times to understand their voices, they were willing to repeat when needed.
After that, we biked over to an ice cream place near the hotel. I ventured into new territory by trying a bowl of cappuccino and almonds ice cream. (Why does that combination exist? I don’t know.) I tried that only because I really like almonds and cappuccinos.
Once we finished the bike ride, we (faculty included, of course) gathered in a ballroom at 4:30PM so that Angie Abbatecola and Trevor Darrell could provide some opening remarks. Fortunately, there were two sign-language interpreters, so I sat in the front left corner of the room. Sadly, most of the other students sat in the back of the ballroom, so I was surrounded by faculty and industry sponsors. (Members from companies sponsoring BVLC were invited to the retreat, such as employees of Facebook and – more surprisingly – a few guys from Yahoo! Japan.) After the opening remarks, we had brief 15-minute faculty talks about their group’s ongoing research. There was some interesting stuff here. In particular, I liked the robotics research from Ken Goldberg and Pieter Abbeel. The former’s research can be succinctly described as “cloud robotics”; the latter’s research can probably be called “deep reinforcement learning.”
Following that, we had a poster session with roughly 25 posters. I did not interact with students much, preferring to instead read the posters carefully. Halfway through the 1-1.5 hour poster session, I had memorized the high-level concepts of all the posters, and I described all of them to Ken Goldberg when he asked me to prove that I knew them.
In the evening, we had a large group dinner at the hotel. Everyone was invited: students, postdocs, faculty, and industry sponsors. I had salmon with spinach and mushrooms, and it was a good meal despite my general distaste for mushrooms.
The dinner bears some further discussion. Large group dinners have historically been some of the most difficult events for me to go through because, without any outside assistance, I cannot follow conversations at my table and feel depressed afterwards. But this time, I was smart enough to request sign language interpreting services not only for the talks, but for the dinner, so the same two guys were there. And it’s a good thing they were there; I spent half of my dinner talking to one person, a sponsor from Samsung, who seemed fascinated by my deafness. He asked me the obligatory “can I lip read?” question, and later asked how I could speak so well since he thought my speech was better than his. (Even though I’ve gone through this subject countless times, I don’t mind the attention.) Then we had some more substantive discussion on technology issues such as automatic speech recognition.
But even though he was sitting right next to me, I had a hard time understanding his voice and looked at the interpreters more than I looked at him.
Incidentally, the dinners were “structured” in the sense that Angie and Trevor wanted (a) people to sit next to new people and (b) for all tables to discuss a common topic. Each of the attendees had name tags with a small colored dot, and we were supposed to sit next to people with different colors. (Side note: I hope next year’s event will actually write the color names rather than have a tiny dot, since we color blind people cannot tell what color we have.) But it didn’t really matter since so many students (and even faculty!) broke the rule; I saw members from the same research group sitting next to each other. Oh well.
The topic for that night’s dinner was the impact and ethics of AI on jobs. This is an important topic because in the future, AI may rapidly displace jobs in the same way that the assembly line and industrialization replaced unskilled labor. In addition, as anyone who has read science fiction will know, there is a fear that AI can eventually become “unstoppable” and sprawl out of human control. Clearly, we have to reassure people that this will not happen. The discussion at my table was interesting, with most of it centered on how technological displacement has been happening all the time, and this is just “the next step.” In addition, my table (and others!) even mentioned that Bernie Sanders was the only “AI-friendly” presidential candidate. I do not agree with that statement for a variety of reasons, but it wasn’t surprising to see many people support that since academics tend to be liberal.
The dinner went far beyond scheduled, and I finally decided to call it a night after the interpreters stayed 30 minutes past the assigned time. (They were really nice to stay, and I was the one who had to convince them to finish up!)
I got very little sleep that night – it took me three hours to fall asleep – but I didn’t want to miss out on the 7:00AM morning hike. I woke up on time, ate some stale breakfast (no coffee, though) and boarded the bus for the hike. The hike was through Jack London State Park. We split up into a “moderate” group and an “intermediate” group; the guides said the intermediate group would have to go on a somewhat hilly route. Unfortunately for the guides, far more people wanted to be on the “intermediate” hike, so we had to split up further. (I was obviously part of the intermediate group, since I didn’t want to give the impression that I was in poor physical shape.)
The hike itself was much easier than “intermediate,” but that was fine with me. It was nice to be outdoors and to forget about academics. I didn’t talk much, but it would have been hard to have a consistent conversation with someone while avoiding the animal droppings on the trail, since I have to look at the person to understand them well.
Upon arriving back at the hotel, we had another set of five faculty talks and a poster session. For this one, I brought an old poster describing a project from last fall, but it sadly didn’t seem to be that popular among the attendees. Following the poster session, we had all (or most) of the industry sponsors give brief talks about their company, but some were just advertising their job openings. I thought the most interesting presentation came from Facebook’s employees, who described an app that helps the blind “see” through photos. Notice the date of that article!
After the sponsor talks, we had three “breakout” sessions where we gathered in smaller groups to discuss a more specific subset of AI. There were three sessions: (1) natural language & vision, (2) deep reinforcement learning, and (3) Caffe. I sat in the natural language & vision session, and we talked about the usual object recognition issues, but there was also some interesting stuff about automatic image captioning. I’m aware that there’s research going on in that area (especially in Trevor Darrell’s group) but I haven’t read any of the papers in detail.
Then, we had one of the more unusual events in the afternoon: a wine-blending competition!
The rules were simple. Each group was given the same set of four red wines: Cabernet Sauvignon, Merlot, Franc, and … something else I can’t remember. We had to choose a blend of the four wines to form a new wine, with the constraint that we could not have more than 50 percent of the blend come from one of the wines. Each group would nominate their “best” wine, which our hosts would then shuffle in private and distribute into glasses for three groups of four (since we had twelve groups total). Then, each group nominated a wine-taster, who would match up with three others in the first round (with their three wines) to taste all four wines and rank them from best (1) to worst (4). The lowest total score in each group of four won. This meant that every wine taster was guaranteed that his/her group’s wine would appear in his/her group of four, so groups had to make wines that were really good but, ideally, could also be easily detected by their group’s wine-taster, so he or she could rank it as (1).
My group nominated Alexei (Alyosha) Efros as our wine-taster, and we won the first round! In our group of four, Alexei correctly picked our wine first, and the three other wine-tasters from the three competing teams each picked our wine as (2), so our total was 1+2+2+2 = 7 points. It’s hard to get better than that! We advanced to the second (and final) round along with the two other winners from their first round groups, and a fourth “wild-card” team which had the best score of any of the non-first place teams.
Sadly, our victory was not to be. Alexei picked our wine last, so that added four awful points to our score. In fact, the three other wine-tasters really liked our wine, so much that we came in second place, I think with nine points (4+2+2+1). Yeah, I should have volunteered to wine taste.
One of the more amusing aspects of the competition was coming up with our team names, and seeing everyone’s reaction when our host read them aloud. Many of the names (predictably) had some form of “deep” in them – my team name was “Deep Drink,” courtesy of Anca Dragan. Our host was immediately suspicious, and thought there had to be some deeper meaning of the word “deep.”
We then had our second dinner, but this time it was at a golf course, and it was preceded by a one-hour reception. As my interpreters would not show up until after the reception, I knew it would be difficult for me socially, so I asked a few students I knew to stay with me. Though those students were foreigners, I could understand them since I led us away from the crowd. The subject of the night? American politics! Obviously, I was the one who initiated the conversation. The funny thing about this was that an international student told me I was the first American to talk to him about politics even though he had been at Berkeley for three years.
The dinner after the reception was surprisingly similar to the previous night’s dinner. We split up into similar-sized tables, I ate fish again, I had two sign language interpreters there, and rules were broken: at my table, five people who knew each other well sat next to each other, and we didn’t talk about the “designated topic” for the night, which was about working in academia versus industry. Once again, both interpreters were very nice and stayed past their assigned time (9:00PM) without me asking them.
The Tuesday morning was more of the same – breakfast, followed by faculty talks, followed by a poster session, then some closing statements, then lunch. A lot of people who drove here went away Tuesday morning before the closing events. During lunch, I did not have interpreters, but it was only for one meal and I can manage (like how I’ve been “managing it” my whole life). I sat next to a man from Yahoo! Japan, and with him being Japanese, it was tough to understand him, but we got some basic conversation going.
Then, at last, I boarded the bus back to Berkeley.
What are my thoughts on the retreat? It went much better than I expected, but this is in part because I have such low expectations that it doesn’t take much to make me happy. (I wasn’t happy all the time, though.)
I think the key for my positive experience was the two dinners. I told this to one of the students who had attended the retreat, and he told me his experience was the opposite: he did not enjoy the dinners, because he could only consistently understand the people who were sitting next to him. And yes, he is hearing, and a few other people I spoke to also confirmed that the noise was an issue for them.
In contrast, I had six ears that night. The weakest two (but still better than nothing) belonged to me. The other four belonged to the two interpreters, one of whom sat across from me and thus was able to follow conversations at the opposite end of the table. For one of the few times in my life, I was actually better off during a crowded dinner setting compared to hearing people. I felt ridiculously happy being with my sign language interpreters and could forget about my past frustrations with these dinner experiences.
And yet … the sign language interpreters almost never made it there.
What Almost Happened
On February 22, Angie Abbatecola sent a joint email to members of BVLC asking them to sign up for the retreat. I looked at the agenda and was excited. I did an informal cost-benefit analysis and thought that, particularly because I didn’t attend last year, I better go this time. I RSVP-ed and sent an email to Angie and to Berkeley’s Disabled Students’ Program (DSP) to inquire about accommodations.
DSP’s initial response was that they were unable to pay for interpreting services since it was not directly related to my coursework, but they would investigate their options and contact me later. I was puzzled at this assertion, because I had gotten services before for research-oriented events, and this (even though it is a social event) definitely qualifies as a research event. I immediately followed up with an email reply saying that this was for a research group and that I wanted to go primarily because I needed to be more involved with the research community and reduce my isolation. I also asked if BVLC would have to pay for the services.
I didn’t get a response.
A week went by, and I sent two more emails asking for an update and/or clarification. For one of the emails, I was told that DSP was still searching for interpreters. Fine, I assumed. There’s plenty of time.
But then another week went by without an update. I sent an email asking for an update, and got no response.
Then Spring Break (March 21 - 25) arrived. I had sent a sixth email (counting from the original email replying to Angie) just before the break started, but I then realized that the staff would probably not work over the break. Uh oh.
The day I returned from Spring Break, which was a week before the retreat would start, I decided I could not wait any longer and marched over to DSP’s offices in person, demanding to know why they had not been able to arrange the interpreting services after four weeks. The staff member there apologized for the delay, and said that it was because the agency DSP uses, Partners In Communication, does not arrange for interpreters to venture beyond San Francisco, Berkeley, and San Jose. Sonoma is roughly an hour and a half’s drive north of Berkeley.
Fortunately, DSP had just found another agency that they could use to arrange for interpreting services. I gave them another copy of the retreat agenda, and highlighted the specific sections for which I was requesting accommodations. I didn’t request services for everything, of course, since it didn’t make sense for some of the events (e.g., the bike ride). In addition, since this was happening on short notice, I figured if I requested fewer hours, the likelihood of the requests being fulfilled was greater.
The following Tuesday, DSP formally submitted the request. But at this time, I was really worried that we would not be able to find any interpreters. I discussed this with my parents and they were enraged that Berkeley’s DSP hadn’t moved fast enough despite me giving them more than a month’s notification. They also questioned the claim that the company DSP negotiates with was not willing to arrange for interpreters to drive an hour and a half north, due to my experience at Williams with interpreters traveling long distances. We discussed my options. One of them was that I could search for an agency and pay for interpreters, and have DSP or BVLC/BAIR reimburse me later.
Needless to say, this concern over interpreting services wasn’t helping me in my ability to focus on research and homework. As it turned out, the third CS 267 homework was due around this time.
On the evening of Thursday, March 31, my frustration and stress had crossed a line. With still no word on any interpreters getting hired, I sent a joint email to Berkeley’s DSP and a few other people (Angie, some faculty), with some rather harsh words, but with the goal of trying to explain why I was feeling stressed. Here are some segments of the email:
I wanted to bring up something that’s been causing me a lot of stress lately. The Berkeley Vision and Learning Center (BVLC, though now known as BAIR) retreat is coming up soon, on April 3, 4, and 5. Unfortunately, as of right now, I still have not received any confirmation that I will have any sign language interpreting accommodations for that event. If the agency who provides the interpreting services is unable to assign anyone tomorrow, then I am not sure if they will be able to assign anyone at all, since their staff may not work on Saturdays. Angie, the contact person for BVLC, sent an email on February 22 announcing the date of the BVLC retreat. A few days later, on February 27, I sent a joint email to Angie and to [Berkeley’s DSP] outlining my general request for interpreting services for the event.
Then, after outlining my frequent reminders, I explained why I was getting stressed:
What I’m trying to explain in this email is partly that not knowing whether I have accommodations is going to affect how I feel during this event. For instance, if I know that I won’t have accommodations, then I have to carefully plan out every detailed minute and ask a variety of people to stick with me during certain events so that they can explain what people are talking about. The worst part, judging from the agenda, will probably be the dinners. I am unable to follow conversation during noisy dinner settings, so I usually end up taking turns watching one person for a minute, then switching my gaze towards another person, then I repeat the cycle.
The best case scenario is that tomorrow, all the requests are fulfilled. Still, this means I have to constantly think and worry about what will happen for this event and need to refresh my email constantly. This comes at the cost of getting real work done, and I also don’t think that most graduate students have to worry about this stuff. I have been suffering from soaring isolation and stress levels since I arrived in Berkeley, and while it’s gotten better this semester, I just don’t want (in the worst case) this event to revert them back to their fall 2015 levels.
This email was the spark that led to action. I finally saw some evidence that we were moving forward to getting interpreters. Angie and the DSP staff began a lengthy email exchange with each other to search for, arrange for, and pay for interpreters. I was copied on those emails, which was an enormous relief.
My best guess, judging from these emails, is that the new company DSP found (shortly after Spring Break) was unable to provide interpreters, so we had to search for a third agency. We finally found one that was willing to hire on short notice, and filed a request on Friday, April 1. Unfortunately, since this was so close to the retreat (and remember, many people don’t work on Saturdays and Sundays), the agency charged more for the late notice. Fortunately, Angie was willing to arrange the extra payment because she wanted me to enjoy my experience. Incidentally, BVLC was the organization that had to make the payment.
Even though we filed a request and BVLC was willing to pay, there was no guarantee for interpreter availability. Surprisingly, on Saturday, I received a notice saying that the agency had found a few interpreters for some of the events. By the time I arrived in Sonoma on Sunday morning, half of the hours I requested had been arranged. Excellent! But that still left the other half unassigned …
Also on Sunday, I discovered a bewildering fact. When I entered the ballroom at 4:30PM for the opening remarks, I recognized one of the two interpreters, because he was a “substitute for the substitute” for me in CS 287 during last semester’s infamous “interpreter substitution” phenomenon.
I was curious about how this new agency was able to arrange for him to come to Sonoma. I assumed he lived somewhere near here, or at least equidistant between Sonoma and Berkeley. I asked him where he lived.
His response? Berkeley.
I couldn’t believe it. After all this time, from Berkeley’s DSP not being able to get their usual company to arrange for someone to drive north an hour and a half to Sonoma, what we finally settled on … was an interpreter who lived in Berkeley! Yes, I’m serious! Indeed, he confirmed to me that he had to drive all the way for the job. Wow.
In the end, things worked out in the nick of time, and all of the unfilled hours were filled by Sunday evening (the Sunday events were booked first, but most of Monday was unassigned when I arrived at the retreat on Sunday). I got lucky – one of the interpreters for a Monday morning event was originally scheduled to interpret somewhere else, but his assignment got canceled, so he was available.
In fact, I had interpreting services for all the hours I requested and for a few times that I didn’t request! I think this was due to two reasons. One was that there may have been some miscommunication and that Angie or DSP accidentally filed more hours than I requested. But the second was a true surprise: the interpreters I had (as mentioned earlier) were kind enough to stay beyond their assigned hours. All the interpreters for the two dinners stayed after 9:00PM, and one Monday afternoon interpreter stayed with me for the wine-blending competition, even though I hadn’t requested services for that. (I was going to, but since it was short notice, I thought the event was lower on my priority list.) I thanked all the interpreters who stayed beyond their hours, and I wish I could thank them again right now.
What is the lesson I learned from this? Requesting accommodations takes time, and some prodding. I lied in that blog post I wrote last month. I didn’t write it because of Teresa Burke’s essay. I wrote it in the midst of this interpreting request (note the date of the post: March 23) and I only found out about Teresa’s lengthier blog post after I remembered reading one of her older emails. Requesting accommodations takes time in part because there is lots of bureaucracy involved. There are rules that get in the way, from company policies to dealing with DSP versus BVLC payment.
But probably the worst part about these episodes is the impact on how I feel. I constantly, constantly feel like I inconvenience people. I think about that all the time, and arranging for the retreat made these feelings worse. BVLC had to pay extra money for the interpreters because of our last-minute request. The company that arranged the interpreters sent us an email describing their pricing, and the charges took a noticeable hike for a request on three days’ notification.
I didn’t compute the final cost, but my rough estimate is that BVLC had to spend a few thousand dollars for this event (perhaps one thousand for a “normal” request, and an extra thousand for the late notification). Do you think I want to be responsible for all that money shelled out? Angie reassured me that it wasn’t my fault, because I sent in the request far in advance and DSP should have acted earlier, which helped to mitigate some of my concerns.
It’s not just the money that’s involved. There are my usual concerns over whether other people get annoyed or distracted in the presence of interpreters. I’m not exactly at the top of the field, and I don’t know what I would do if a famous professor demanded that the interpreters be removed.
My concerns extend to other events in the future. As a worrisome example, what happens if I attend an academic conference? It was hard enough to get accommodations for an event located in Sonoma, CA, which is an hour and a half drive from Berkeley, CA. Imagine what would happen if I requested an interpreter for a conference in China? There can’t be too many (American/English) interpreters in China, and international flights aren’t cheap.
I’m very, very anxious and concerned about having to plan this out.
Hopefully this explains why I thought the retreat was a “Disaster Averted” moment for me. It was shaping up to be awful, but somehow, someway, things ended up better than expected. Moreover, I even finished that CS 267 homework in time. Whew. But why do I need to go through these experiences?
My hope is that, one of these days, I’ll be able to enjoy going to gatherings and similar events without having to constantly worry about accommodations, payment, inconveniencing people, and socialization.
Though I do not embrace Donald Trump politically, there is one aspect of his campaign to which I am sympathetic. One reason why Trump has been successful beyond most early predictions is because he appeals to the frustrations of working-class whites who lack a college education. The demographics of his supporters have been widely covered and verified from sources such as The Atlantic, Politico, and FiveThirtyEight.
I can relate to those voters to some extent because I have my own frequent frustrations, though mine are of a vastly different nature than the ones afflicting his supporters. (While I do not make much money as a graduate student, I have accepted this trade-off for the opportunity to build my computer science skills and do not view my income as an issue.) I’ve thought about this surprising “juxtaposition of frustrations” for a few months.
I’ve also wondered about what would happen if there was a politician who could directly appeal to my frustration. To be clear, I don’t think any politician could or would want to do that. Politicians, for better or worse, have to speak to large groups of people who tend to vote together, because that’s where the votes will come from. Donald Trump needs the support of working class whites, who (despite their relative decline in the share of the population) still compose a substantial fraction of the electorate. A similar case is happening with the Democratic party; Hillary Clinton has to appeal to the minority vote because non-whites heavily vote Democratic. It’s not bad politics for politicians to do that, and if I were a politician running for a prominent elected office, I would do the same thing. It just means that people like me or others who feel excluded from politics may feel left out, as covered by this NY Times article [1].
As stated earlier, it is unlikely that a politician would be able to directly appeal to me. In fact, even if someone did do that, I am still not sure if I would vote for him or her. Politicians across both political parties are notorious for making extravagant promises that don’t materialize.
My purpose in outlining my thoughts here is partly to raise some thought-provoking questions on how politicians can appeal to as much of the population as possible in the midst of conflicting goals among voting blocs. One challenge is that there is an enormous spread of economic power among people within blocs. Those who may lack opportunity, but who nonetheless fall into a group of people who have historically had advantages in our society, may feel resentful that their voices are ignored. It’s a delicate balance to try and address their concerns while also ensuring fairness and equality as much as possible, and to counter the perception that addressing one group of voters (e.g., working class whites) might alienate other groups of voters (e.g., the African American voting bloc). I don’t have the answers for this. Being a politician must be an incredibly difficult job.
Again, I disagree with Trump politically, but I can understand the frustrations some of his supporters may feel. I think it is important that we not ignore these supporters, who (aside from the Ku Klux Klan) are reasonable American citizens. I hope that his campaign, while controversial, will have a positive long-term effect in that politicians across the political spectrum will be more sensitive to the needs of people who feel politically ignored.
[1] A quick note: I am technically not a white male because I am half Asian, but Asians receive far less attention in politics than African Americans and Latinos/Hispanics, for obvious reasons: we don’t have as much voting power.
The title of this blog post, “Gradient Descent Converges to Minimizers,” is the name of a preprint recently uploaded to arXiv by several researchers at Berkeley, including Ben Recht, who has taught two of my classes (so far). Judging by the arXiv comments, this paper is in submission to COLT 2016, so we won’t know about its acceptance until April 25. But it looks like quite a number of Berkeley people have skimmed this paper; there was some brief discussion about this on a robotics email list. Furthermore, there’s also been some related work about gradient descent, local minima, and saddle points in the context of neural networks. I’ve read two such papers: The Loss Surfaces of Multilayer Networks and Identifying and Attacking the Saddle Point Problem in High-Dimensional Non-convex Optimization. Consequently, I thought it would be interesting to take a look at the highlights of this paper.
Their main contribution is conveniently outlined in a single obvious paragraph (thank you for clear writing!!):
If $f : \mathbb{R}^d \to \mathbb{R}$ is twice continuously differentiable and satisfies the strict saddle property, then gradient descent with a random initialization and sufficiently small constant step size converges to a local minimizer or negative infinity almost surely.
Let’s make it clear what this contribution means:
We’re dealing with the gradient method, $x_{t+1} = x_t - \alpha \nabla f(x_t)$. It’s nothing too fancy, and the constant step size $\alpha$ makes the analysis easier.
The sufficiently small step size means we want $\alpha < 1/L$, where $L$ is the Lipschitz constant of the gradient. In other words, $\nabla f$ satisfies the well-known inequality $\|\nabla f(x) - \nabla f(y)\|_2 \le L \|x - y\|_2$ for all $x$ and $y$. I have used this inequality a lot in EE 227C.
The strict saddle property restricts $f$ so that every critical point $x^*$ (i.e., those points such that $\nabla f(x^*) = 0$) is either (a) a local minimizer, or (b) has $\lambda_{\min}(\nabla^2 f(x^*)) < 0$. It serves to restrict $f$ because other functions could have critical points where all the eigenvalues are zero. Note that since the Hessian is a symmetric matrix, all the eigenvalues are real numbers. In addition, at a local minimizer the eigenvalues of $\nabla^2 f(x^*)$ are all nonnegative, and if they are all strictly positive then $x^*$ is guaranteed to be a local minimizer.
They claim that the gradient method will go to a local minimizer. But where else could it go to? There are two other options: saddle points, and local maxima. Gradient descent, however, cannot go to local maxima because it is by definition a descent procedure, unless (I think) for some reason we’ve initialized $x_0$ as a point that is already a global maximum, so $\nabla f(x_0) = 0$ and we get nowhere. So the only things we worry about are saddle points. Thus, if “saddle points are not a problem” as suggested in the paper, then gradient descent converges to local minimizers, as desired.
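For anyone who wants the missing step behind “it is by definition a descent procedure,” here is the standard textbook argument (it is not taken from the paper). It assumes $\nabla f$ is $L$-Lipschitz and writes the update as $x_{t+1} = x_t - \alpha \nabla f(x_t)$ with $0 < \alpha < 1/L$; the iteration index $t$ is my own notation.

$$
\begin{aligned}
f(x_{t+1}) &\le f(x_t) + \nabla f(x_t)^T (x_{t+1} - x_t) + \frac{L}{2} \|x_{t+1} - x_t\|_2^2 \\
&= f(x_t) - \alpha \left(1 - \frac{\alpha L}{2}\right) \|\nabla f(x_t)\|_2^2 \\
&\le f(x_t) - \frac{\alpha}{2} \|\nabla f(x_t)\|_2^2
\end{aligned}
$$

The last line uses $\alpha L < 1$. So each step strictly decreases the function value unless the gradient is already zero, which is why the iterates cannot climb to a local maximum but can stall at any critical point they start on.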
It’s worth discussing saddle points in more detail. The paper “Identifying and Attacking…” uses the following diagrams to provide intuition:
Image (a) is a saddle point of a 1-D (i.e., scalar) function. Images (b) and (c) represent saddle points in higher dimensions. They are characterized by the eigenvalues of the Hessian at those critical points. If all eigenvalues are non-zero, with at least one strictly positive and at least one strictly negative, then we get the shape of (b) with a min-max structure. If there exists a zero eigenvalue, then we get (c) with a degenerate Hessian. (Recall that a matrix is invertible if and only if all its eigenvalues are non-zero.) Image (d) is a weird “gutter shape” which also results from at least one zero eigenvalue. I’m not completely sure I buy their explanation – I’d need a little more explanation for why this happens. But I suppose the point is that the authors of “Gradient Descent Converges to Minimizers” don’t want to consider degenerate cases with zero eigenvalues. It must make the analysis easier.
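The eigenvalue cases above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: it handles a 2-by-2 symmetric Hessian `[[a, b], [b, c]]` in closed form, and the function names, the labels, and the tolerance `tol` are all hypothetical choices of mine rather than anything from either paper.

```python
import math


def eigenvalues_2x2_symmetric(a, b, c):
    """Return the eigenvalues of [[a, b], [b, c]], smallest first."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius, mean + radius


def classify_critical_point(a, b, c, tol=1e-12):
    """Label a critical point from its 2x2 Hessian (hypothetical labels)."""
    lo, hi = eigenvalues_2x2_symmetric(a, b, c)
    if abs(lo) <= tol or abs(hi) <= tol:
        return "degenerate"          # at least one zero eigenvalue
    if lo > 0:
        return "local minimum"       # all eigenvalues strictly positive
    if hi < 0:
        return "local maximum"       # all eigenvalues strictly negative
    return "saddle (min-max)"        # mixed signs, no zero eigenvalue


# Hessians at the origin of three simple functions:
classic_saddle = classify_critical_point(2.0, 0.0, -2.0)  # x^2 - y^2
bowl = classify_critical_point(2.0, 0.0, 2.0)             # x^2 + y^2
monkey = classify_critical_point(0.0, 0.0, 0.0)           # x^3 - 3xy^2
print(classic_saddle, bowl, monkey)
```

Note that a "degenerate" label here only says the Hessian is singular; a point with one zero and one negative eigenvalue would still count as a strict saddle under the $\lambda_{\min} < 0$ definition.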
Section 3 of “Gradient Descent Converges to Minimizers” provides two examples for intuition. The first example is $f(x) = \frac{1}{2} x^T H x$, where $H = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ has no zero diagonal entries (and hence no zero eigenvalues) but must have at least one positive and at least one negative entry. Otherwise, we wouldn’t have any saddle points! By the way, the only critical point for this function is $x = 0$, as $\nabla f(x) = Hx = 0$ if and only if $x = 0$.
The gradient update is $x_{t+1} = x_t - \alpha H x_t = (I - \alpha H) x_t$. Applying this recursively, we get $x_t = (I - \alpha H)^t x_0$. More specifically, the iterates take on the following form:
$$x_t = \sum_{i=1}^n (1 - \alpha \lambda_i)^t \langle e_i, x_0 \rangle e_i$$
Indeed, an analysis of gradient descent with $\alpha < 1/L$ (here $L = \max_i |\lambda_i|$) shows that gradient descent will only converge to $x = 0$ if the initial point $x_0$ is in the span of $e_1, \ldots, e_k$, where $e_i$ is the $i$-th standard basis vector, the eigenvalues are ordered so that the positive ones come first, and $k$ represents the number of strictly positive eigenvalues (so $k < n$). Remember: we don’t actually want to converge to that point, since it is a saddle point! But fortunately, as $k < n$, if we randomly initialize $x_0$ appropriately, the only way our iterates converge to the zero vector is if all components of $x_0$ from $k+1$ to $n$ were exactly zero, and the probability of that happening is zero. Great! We don’t converge to the (bad) critical point! We converge to … a better point, I hope. (The paper uses the term “diverge” but I get uneasy reading that.)
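To convince myself of the quadratic example, here is a minimal numerical sketch. Everything specific in it is my own assumption rather than the paper’s: the eigenvalues `[2.0, 1.0, -1.0]`, the step size of half of $1/L$, the Gaussian initialization, and the 200 iterations. It runs the update coordinate-wise, since $H$ is diagonal, from a random starting point and from the same point with its last coordinate zeroed out.

```python
import random


def gradient_descent_quadratic(lams, x0, alpha, steps):
    """Iterate x <- x - alpha * H x for the diagonal matrix H = diag(lams)."""
    x = list(x0)
    for _ in range(steps):
        # Each coordinate is simply scaled by (1 - alpha * lambda_i).
        x = [(1.0 - alpha * lam) * xi for lam, xi in zip(lams, x)]
    return x


def norm(x):
    """Euclidean norm of a list of floats."""
    return sum(xi * xi for xi in x) ** 0.5


lams = [2.0, 1.0, -1.0]             # assumed: two positive, one negative
L = max(abs(lam) for lam in lams)   # Lipschitz constant of the gradient
alpha = 0.5 / L                     # assumed step size, below 1/L

random.seed(0)
x_random = [random.gauss(0.0, 1.0) for _ in lams]
# Same point, but projected onto the span of the positive directions.
x_stable = [x_random[0], x_random[1], 0.0]

end_random = gradient_descent_quadratic(lams, x_random, alpha, 200)
end_stable = gradient_descent_quadratic(lams, x_stable, alpha, 200)
print(norm(end_random), norm(end_stable))
```

The starting point with a zero last coordinate shrinks to the saddle at the origin, while the random one blows up along the negative-eigenvalue direction, which is the “diverge” behavior the paper refers to for this unbounded function.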
The second example is $f(x, y) = \frac{1}{2}x^2 + \frac{1}{4}y^4 - \frac{1}{2}y^2$. Finding the explicit gradient update is straightforward, and is provided in the paper. They also explicitly state the three critical points of $f$. Their argument is similar to the previous example in that they can reduce the cases of converging to an undesirable saddle point to a case which would require initializing a certain component of the starting 2-D point to zero, which cannot happen with random initialization (well, the technical way to say that is “zero measure” …).
I still have a few burning questions on these (plus some of the other stuff mentioned in Section 3) but I’ll hold off on writing about those until I have time to get to the meat of this paper, Section 4. In the meantime, it will be interesting to see what kind of work gets built off of this one.
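Here is a small sketch of the second example, assuming the function is $f(x, y) = \frac{1}{2}x^2 + \frac{1}{4}y^4 - \frac{1}{2}y^2$ (my recollection of the paper’s Section 3), whose gradient is $(x, y^3 - y)$ and whose critical points are $(0, 0)$, $(0, 1)$, and $(0, -1)$. The step size of 0.1, the starting points, and the iteration count are my own choices. Note that this gradient is not globally Lipschitz, so the step size is only reasonable because the iterates stay in a bounded region.

```python
def grad_step(x, y, alpha):
    """One gradient step on f(x, y) = x^2/2 + y^4/4 - y^2/2."""
    # The gradient of f is (x, y^3 - y).
    return x - alpha * x, y - alpha * (y ** 3 - y)


def run(x0, y0, alpha=0.1, steps=2000):
    """Run gradient descent from (x0, y0); alpha and steps are assumed values."""
    x, y = x0, y0
    for _ in range(steps):
        x, y = grad_step(x, y, alpha)
    return x, y


saddle = run(1.0, 0.0)    # y-component exactly zero: stuck on the x-axis
right = run(1.0, 0.3)     # y0 > 0
left = run(1.0, -0.3)     # y0 < 0
print(saddle, right, left)
```

Only the start with a zero $y$-component ends at the saddle $(0, 0)$; the other two starts end at the local minimizers $(0, 1)$ and $(0, -1)$, which matches the measure-zero argument.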
One of the things that I’ve been a little frustrated about lately is the time it takes to arrange and obtain academic accommodations, such as sign language interpreting or captioning services. I can’t just show up to a lecture or an event and expect a sign language interpreter to be there. I have to explicitly request the service, and there are many reasons why this process might get delayed.
Before agencies or institutions provide the service, I have to prove that I need the service. This means, at minimum, I need to provide them my audiogram, and they might need some additional background information about my education. Sometimes an interview is required; I had a remote interview with a Berkeley DSP employee before I had arrived for my first classes.
After the initial registration hurdle, I can start formally requesting accommodations. To schedule an accommodation for a campus-related event, I have to fill out an online form with information about the time, the location, and other stuff. Berkeley’s gotten better with the forms, as they’ve implemented extra features that help to counter my earlier criticism. On the other hand, there can still be a noticeable delay between when I submit the form and when I get responses, and I have to keep reminding myself that weekends and vacations do not count as “real days” when counting how many days in advance to submit a request.
In some cases, it can be extremely annoying to schedule accommodations for one-time events. If it is the first time that I am participating in an event, then I usually don’t have much information on the setting or environment, and it is not always clear if there will be one speaker (which is easier for an interpreter) or a debate with people shouting simultaneously. In addition, I often need to have a detailed schedule of the event, and it’s common to have people wait until the last minute to finalize schedules. I’ve had to send lots of emails to remind others that I need a detailed schedule ASAP, and people hate to see “ASAP.”
Finally, it’s not clear how much accommodations can help in practice. I’m not counting cases when there’s some kind of mistake in the scheduling (they do happen sometimes, as in my prelims). I’m considering cases when they work normally, but they simply do not produce any benefit. For instance, when I took CS 288, I had captioning services. In general, they worked as intended (well, not always) but it was extremely hard for me to follow and understand concepts based on real-time, imperfect captions.
I should note, of course, that I’m not the only one who has mentioned this. In fact, I was actually inspired to write this short piece after reading a longer essay by Teresa Blankmeyer Burke, an Associate Professor of Philosophy at Gallaudet University who is deaf. Her blog post covers some of the themes regarding the time it takes to schedule accommodations. I think her experience is similar to mine: lots and lots of emails to write and forms to fill.
Nonetheless, despite the annoyance of scheduling accommodations, it is important for me to look at the big picture. First, I usually get the accommodations, which is something that not every deaf person in the world can say. In addition, even when accommodations do not work that well, I know that the people providing them are trying their best to help me, and I appreciate that.
I often think about some of my lifelong regrets, probably because I’m in a stressful period of my life.
What could I have done differently? Would I be a much better person today if I had done this instead of that? Why didn’t I think about doing such obvious acts earlier?
Hopefully if I list them here, I can look back at this blog post periodically and ask myself if I’m making progress towards mitigating my constant guilt over these regrets.
Here are ten of my major lifelong regrets:
(1) I did not do enough math, statistics, and computer programming, both during college and (especially) before college. To be clear, I have been a good student my entire life, getting mostly top grades in the hardest courses available to me. But gradually, it became clear to me that I was just an “average good student”, and at Berkeley, there are a lot of “better than good students” who boast ridiculously long lists of math/programming accomplishments, and long lists of graduate-level courses taken.
I am constantly thinking about how I have to study a certain concept many times or take an extra class because I need to “catch up” to far more experienced students (in my year). Looking back, I wish I had taken all of my high school classes two years earlier than when I actually took them, which would have given me a bigger head-start in college. And programming? Upon graduating from high school, I couldn’t make a simple “Hello World” program, whereas other Berkeley students (and this is especially common among international students) were busy winning programming competitions in high school.
In Outliers (more on that later) Malcolm Gladwell describes how hockey players who were born near January 1, and thus had the size edge during youth leagues, are more likely to reach the highest level of the sport than other guys born at different times of the year. This is because being good early leads to snowballing advantages. This “snowballing” is what I wish could be an advantage for me, not a disadvantage. It’s also partly why I don’t think that just taking courses makes it easy to catch up, as (to take an example) professors would rather work with students who have already taken graduate courses in their research area over “riskier” students who have to take those courses and who may not like them or may not do well in them. It’s really hard to catch up.
(2) I often did not make it clear to others that I was deaf, in part because I was embarrassed by it. I discussed my uneasiness in telling people that I was deaf in a blog post a few months ago, but here, my focus is on my pre-college life. Starting from middle school, which is when I first became conscious of my dreadful social hierarchy position, I constantly tried to hide my deafness by not signing in public and by focusing on my teachers instead of my sign language interpreters during classes. In high school, I expressed little enthusiasm in discussing “deafness” with anyone. In my senior year, it was awkward for me to write my college essays, since my parents were adamant that I should write about being deaf. (My difficulties in expressing my thoughts probably explains why I didn’t get into many colleges: lackluster essays plus lack of impressive extracurricular activities.) Fortunately, by the time I got to college, I had learned to watch the interpreters more often, but I still don’t generally tell people I’m deaf when we meet for the first time. It’s still a little awkward.
(3) I spent too much of my life emphasizing sports, either playing sports or following sports-related news. I have spent many hours doing organized soccer, baseball, basketball, skiing, and to a lesser extent, ultimate frisbee and track & field. In addition, during down-time, I would often read popular sports websites such as ESPN and NBA.com. But I think I could have put that time to better use, because sports haven’t exactly been the greatest thing for me. Some people join sports to get to know other people, but I don’t think I made a single friend out of being on a sports team. In addition, sports were often a source of stress in my life. I was usually not among the top players on my sports teams, and I constantly worried about screwing up and embarrassing myself. Finally, and probably most importantly, I’m not sure I genuinely enjoyed sports. When my high school soccer teams won important games (or scored game-winning goals), I was one of the least enthusiastic players on the team during the celebrations. While other players might hoot and holler and pile up upon the player who scored a winning goal, I would quietly do a few token jumps.
(4) On a related regret, I did not do enough to improve my physical fitness. This is not the same as playing a sport; it’s about the work of weight lifting to get stronger and running to improve stamina. Speaking as someone who’s played a lot of sports, I can definitely vouch for the importance of physical fitness and conditioning. Consider this: if someone doesn’t have the foot skills to handle a soccer ball well, but has incredible speed and strength, that player could be a solid defender on a good soccer team.
I have this regret mainly because I was never among the most athletic players on my high school teams. (I know that a lot of this is genetics, but genetics doesn’t explain everything.) When I was on the high school soccer team, for instance, I was regularly among the slowest long-distance runners when we ran laps and probably the slowest sprinter on the team. And, while I had tried going to the weight room often, I was unable to really notice any strength difference. That changed once I had read Starting Strength and Stronglifts in college and got to see noticeable gains in my weight lifting and overall strength, but that raises the question: why didn’t I know about those resources before college? Fortunately, I’ve gotten a little better at working on my strength, but I’ve also been lagging behind on my running.
(5) I did not have a good diet until I was around 21. The biggest reason why I consider my diet to have been so bad is that I ate a lot of refined carbohydrates: lots of pizza, plain bagels, white rice, and (sometimes) white pasta and white cereals. Furthermore, even if I had always gone whole wheat for these, having a diet that is 90 percent based on whole wheat does not count as a good diet. I used to eat from Subway a lot, which has heavily processed meat. I also would drink a lot of diet soda, which is almost as bad as regular, sugary soda. What I should have done was emphasize lots of fruits and vegetables, lots of (properly-prepared) meat, and lots of eggs. Of course, all of these have to be cooked and prepared properly, especially in the case of meat. For this regret, I’m happy to say that I’ve made a lot of progress in overcoming my guilt over this. When I was 21, I forced myself to overhaul my diet, and it’s now far more rich and nutritious today than it was a few years ago.
(6) I spent too much time playing video games and computer games. I’ve played a variety of games in my lifetime: sports games, real time strategy games, turn-based games, shooter games, building/tycoon games, and others. The two that I have probably played the most are Age of Empires II and Civilization IV. In middle school and high school, I spent far more time playing them than is healthy, sometimes spending ten hours a day when I didn’t have school. I guess one reason why I liked these games so much was that they were strategy games designed to test my mind, that they were related to designing and building empires, and that they were just a whole lot of fun. In addition, these games do not require me to understand any dialogue that happens in them. There are lots of in-game sounds, but that’s what they are: sounds, not words, which are harder for me to discriminate. Fortunately, while I still play some of these games once every few weeks, I no longer have the immediate urge to play a game whenever I have free time. I think I grew out of that urge during my college years.
(7) I didn’t read enough educational books in my spare time. It’s important to be clear about what I mean here: books assigned as class reading and books that can roughly be described as “non-educational” (e.g., comic books, books describing how to play games, most novels, etc.) do not count. I mostly want to read non-fiction books that cite relevant literature to back up their points. One of the few books I did read that satisfies my non-fiction criteria is Outliers, which I brought up earlier in point (1). It’s a testament to the book’s quality that I still remember a lot about it after eight years, but it’s also been a source of frustration for me. Outliers proposes an interesting “10,000-hour” rule, where one has to spend that amount of time deliberately practicing a skill in order to master it. But it cites Bill Gates as an example, who by the time he had arrived at Harvard as an undergraduate, already had lots of experience working with computers (and remember, this was back before they were commonplace). When I look at my life, I wish I had gotten a head-start on those 10,000 hours on a certain area; usually, I think of programming.
Fortunately, I now have gotten a lot better at reading more books. I have read fifteen books this year (so far) and plan to write up a summary of each book I’ve read in a giant blog post at the end of this year. Most of the books I read are well-regarded non-fiction books that relate to real-world subjects of interest: foreign policy, history, technology, psychology, and other areas. But I still feel like I am reading all these books partly to make up for lost time.
(8) I spent too much time browsing random websites and message boards. In part, this was due to my obsession with playing games. For instance, I have almost 6,000 posts on the Civilization Fanatics Forum and was known as one of the top single-player Civilization IV players on the forum. (Yeeeaah … I was really obsessed with that game!) I also posted on other message boards in addition to game-related ones. Sadly, College Confidential was one of them [1]. In part because I don’t play games that much anymore, I have been a lot better at avoiding message boards. In addition, because I have so much on my plate now in terms of research and coursework, I spend far less time aimlessly browsing the Internet.
Nowadays, there are only a handful of websites I check on a regular basis, and if they are blogs or news-related, I try not to check them until the evening. I deliberately have only a few websites bookmarked on Google Chrome, and I don’t spend as much time reading other people’s blogs as I used to. Oh, and what about Facebook? Don’t worry – Facebook was actually one of the earliest sites that I was able to resist checking.
(9) In college, I was not aggressive enough in reaching out to other students to work on homework together. I think part of the reason for this is that, for some time, I actually wanted to do homework by myself. To be clear, I was not ignoring requests to work together; I was simply not active in reaching out to other students. I thought that if I worked on my own, I would avoid distractions and learn faster. That worked for a few courses, but as the material became more advanced, I needed to talk to more students, and it was hard for me because I lacked a social base. I relied almost entirely on TAs and professors for assistance with coursework. Fortunately, I’ve now completely changed my stubborn “work alone on homework” strategy and have found other students to work with during classes in recent semesters. As a bonus, my homework has generally improved.
(10) This is the most recent regret I have, focusing on my experience during the past three years. For some reason, I’ve (hopefully temporarily) lost the capability to ignore my isolation. I have let it adversely affect my mood and productivity and I worry about how others view me. It’s true that being able to do better in my courses and, especially, getting some research papers would help me combat my constant obsession about isolation, but at the moment I need to figure out how to ignore these thoughts. I think part of it has to do with growing up and getting older; I have higher expectations for myself, both socially and academically, and I want to aim high.
Hopefully in five more years I can look back at some of the progress I’ve made. As covered earlier, I have made some progress on overcoming some of the constant guilt I feel about myself. I just want to be a better person and not feel like I am constantly in “catch-up” mode with regard to my life.
[1] College Confidential is one of the most depressing places on the Internet. Please don’t go there.
When I’m doing some homework, research, or trying to learn a concept, I want skill to be my limiting factor. What do I mean by this? I mean that if I’m having difficulty doing some work, the reason should be straightforward: the work is challenging, and I need more skill (or more accurately, that plus lots of focus and effort) to be able to accomplish my objectives.
Unfortunately, in recent times it has not been skill that is my limiting factor, but how well I feel.
I’m writing this post after a week when I was stressed over feeling isolated, both in terms of research and in terms of social settings. The former is because I haven’t made much research progress, and I feel like I’m cut off from the research community. The latter is, well, kind of obvious.
When I tried to do work this past week, the limiting factor was how well I could bring myself to focus and keep my thoughts about isolation at bay.
Almost all of my negative experiences, almost all of my sources of stress and depression, can be traced back to some sort of isolation.
This explains my frustration, which is another suitable word to describe my Berkeley experience so far. I feel like I’m capable of so much more, except I keep getting distracted. The result is that I feel like I’m in a deep hole and I can’t climb out.
I can’t believe that most people want to feel this way. As far as I understand, humans are social creatures, and people want to feel like they belong. I view myself as a social person, even if I sometimes cannot convey that message clearly enough. I think about social settings all the time; it’s a common theme that appears when my mind wanders. Unfortunately, reality usually strikes a few moments later, in the sense that there are many people who I want to talk with, but I don’t feel like I can talk with them. I might think they’re out of my league socially, or that communication would be difficult for some reason (e.g., accents).
Fortunately, whenever I do get an extensive conversation with someone, it’s enough to keep me refreshed and deplete my internal “isolation meter” for a few days.
The bar for me being happy is set really low.
In the meantime, I have to learn how to stay positive, and I’ll continue searching for people I can work with, hoping that … at … some … point … I can find that true collaborator to answer my dreams.
While I have access to many advanced and high-quality classes at Berkeley, sometimes I need or want to review foundational topics to make sure I really get the material. I skipped (or did not do well in) some early prerequisites for upper-level computer science and math courses, so I constantly feel like I have to make up for that in my own time.
In this post, I describe four classes that I have self-studied to a substantial extent. Two are from MIT’s Open Course Ware (OCW), and two are from UC Berkeley. All four of my self-studying pursuits were enormously beneficial to me.
Here are the courses listed according to the time I self-studied the material (“My Time”):
18.440, Probability and Random Variables
- Institution: Massachusetts Institute of Technology
- Professor: Scott Sheffield
- Course Offering: Spring 2011
- My Time: Summer 2012
To prepare for my probability class at Williams, I decided to go through MIT’s version of the same class. (In fact, I even made a blog post about this.) Unfortunately, the version on OCW doesn’t offer any lecture videos, so I instead went through all the lecture slides and made sure to understand basically everything covered there. I also took both practice midterms and the practice final, which were surprisingly easy.
To increase my understanding as I was studying the OCW materials, I also read a draft of The Probability Lifesaver, written by Steven Miller (a professor at Williams College).
CS 61B, Data Structures
- Institution: University of California, Berkeley
- Professor: Jonathan Shewchuk
- Course Offering: Spring 2014
- My Time: Summer 2014
I went through all of the class lecture notes from the course website (accessible through Shewchuk’s homepage) and made sure to study them well. I went through all of the homeworks and labs, and made some progress on all three of the major projects. I didn’t complete them – I just got to a stage where I knew I had made a lot of progress and felt that I understood the purpose of the project. One thing I wish I had time for was to do more practice exams, especially those from Paul Hilfinger’s versions of CS 61B (a.k.a. the harder versions).
This class was super helpful for me because I never had a strong data structures education, so I reviewed a lot of concepts that I had implicitly assumed were true but didn’t know why. Reviewing CS 61B made it possible for me to do the tough Java programming assignments in CS 288.
18.06, Linear Algebra
- Institution: Massachusetts Institute of Technology
- Professor: Gilbert Strang
- Course Offering: Spring 2010
- My Time: December 2014 and January 2015
This is a fairly popular MIT OCW course, in part because of the reasonably high quality video lectures. Gilbert Strang lectures at a relatively slow pace, which is fine with me because I prefer slow-paced lectures (and tough assignments). I went through all of the video lectures and made sure to understand them as much as I could, which alone was enough for me to feel like I learned a lot. I briefly read some other related class handouts, but most of the time, my supplemental learning resource was … Wikipedia.
In the future, I should do some of the practice exams.
CS 188, Introduction to Artificial Intelligence
- Institution: University of California, Berkeley
- Professors: Pieter Abbeel and Daniel Klein
- Course Offering: Spring 2012, 2013, and 2014 (varies)
- My Time: Summer 2015
This is a popular undergraduate CS course, both within CS and outside of CS for students who want to “try out computer science.” Fortunately, the class has a lot of material online. I went through all 24 video lectures, but I used different years depending on which YouTube videos had auto-captions and/or which ones had louder sound. Each lecture came with detailed (and humorous) slides, so I read all of those as well.
In addition, I think I took about 15 practice exams (yes, that is not a typo) and rigorously checked my answers with the solutions. It was probably overkill for me, but I really wanted to know this material well. To my disappointment, I noticed that certain questions were recycled (in some form) year after year. Thus, I can’t wait to be a GSI for this class later so I can ramp up the difficulty of the exams by not using those kinds of questions.
I plan to continue my self-studying pursuits during the upcoming summer. Here are the courses on my “self-study radar”:
At UC Berkeley, CS 61C, Machine Structures. This is in progress … but barely. I’ve personally resolved that I won’t do any other self-studying of a computer science course until I finish this one. Knowing this material cold is simply too important for me. After this, I can branch off to other, more advanced areas such as self-studying operating systems.
At MIT, 8.01, Introduction to Physics. I’ve never taken a physics course before and I have not started doing this. The downside is that it might be tough to find practice material since MIT had to pull down the material due to Walter Lewin’s inappropriate actions. The videos are online, but I may have to do some searching for the assignments and exams.
I don’t have anything on my radar for math and statistics, in part because probability and linear algebra are so ridiculously important, that if I really wanted to do any self-studying, it would be better for me to actually re-re-study those two courses! In fact, I should probably be doing that this summer anyway.
Yesterday, I finished reading Atul Gawande’s fascinating 2002 book Complications: A Surgeon’s Notes on an Imperfect Science. This is the first of his four books – all of them bestsellers – and I read his other three books earlier this year. I’m staying on track to read far more books than I had planned to as part of my New Year’s Resolution, so that’s nice. Also, unlike in December 2015, when I only discussed the top three books I read, in December 2016, I plan to cover all of the books I’ve read that year. I’ll do it in one blog post, with one paragraph for each book, plus some additional commentary. Complications is already the fourteenth book I’ve read in 2016, so that blog post will be super-long. (It’s currently in a draft state behind the scenes so that I don’t have to write it all at once.) Stay tuned until December 31, 2016, everyone!
But anyway, I wanted to comment on a particularly interesting portion of the book. In a chapter describing a pregnant woman’s intense nausea which mystified her doctors (hence the title of the book), the following text came up:
In 1882, the Harvard psychologist William James observed that certain deaf people were immune to seasickness, and since then a great deal of attention has been focused on the role of the vestibular system—the inner ear components that enable us to track our position in space. Scientists came to believe that vigorous motion overstimulates this system, producing signals in the brain that trigger nausea and vomiting.
My first reaction was: hey, this is pretty cool! And this was done in 1882? Really?
Just to give my personal experience: while I’ve occasionally had mild cases of nausea, I can’t recall ever feeling seasickness, or any kind of motion sickness. I’m a huge fan of roller coasters, for instance, and I can ride them often without feeling sick. To make the point clear: when I was in eighth grade, I rode the Boomerang Roller Coaster at The Great Escape twenty-five times – in one day. (Ahh … those memories of empty lines and being able to quickly exit the coaster and race to get back on.)
It’s pretty cool to think about being immune to something. My mind wandered to thoughts about whether deaf people might be immune to things such as deadly diseases. I remembered that passage at the start of The Death Cure when the Rat-Man told Thomas: “The Flare virus lives in every part of your body, yet it has no effect on you, nor will it ever. You’re a member of an extremely rare group of people. You’re immune to the Flare.”
I thought about some of these things as I read through the rest of Complications, so after I finished the book, I decided to briefly investigate further. Here is what I found.
First, it’s clear that it’s not deafness that causes the so-called “immunity” but, as Gawande points out, the condition of the vestibular system. I’m not sure whether a weakened vestibular system is the cause of my deafness. I have been deaf since birth and as far as I know, there’s no explanation for it besides the randomness of genetics.
One of the most commonly cited sources for this fact (or “myth” as some might call it) is an old 1968 paper called Symptomatology under storm Conditions in the North Atlantic in Control Subjects and in Persons with Bilateral Labyrinthine Defects. Yeah, the title is pretty bad but the paper showed that in an experiment, a few deaf people did not experience seasickness.
The original source, William James’ 1882 “study”, is called “The Sense of Dizziness in Deaf-Mutes”, but I can’t figure out a way to access it; it’s trapped behind several websites that restrict access (ugh). I can’t even use my Berkeley credentials. All of my knowledge about that paper therefore comes from third-party sources.
Almost all other sources about this topic are from really random and ancient research or (worse) newspaper articles. Here’s one example from a 1986 article on the SunSentinel, and it’s pretty lame. Also, I tend not to trust articles that appear on ad-heavy websites.
So yeah, there isn’t that much focus on deafness as far as the vestibular system is concerned. Shucks.
As I was reading through these ancient sources (well, the ones I could access), I also wondered about the evolutionary benefit of being deaf. I can’t think of any, unless deafness somehow came with another benefit to counteract its negative effects. I hope it’s some secret immunity.
I mean, think about how bad life would have been for deaf people throughout most of human history. If I had been born in 1800, for instance, I wouldn’t have had access to the high-quality hearing aids I’m wearing as I type this blog post. In fact, the best kind of “hearing aid” I could have used would be those terrifying (and ineffective) ear trumpets. Ugh.
Going further, consider the prototypical “cavemen”. For them, having good hearing would be more important than it is for humans today; there was no sort of disability law and little to no visual communication media (e.g. writing) to compensate.
This line of reasoning could also extend to other disabilities. Why do they keep appearing in our population? A quick Google search of “evolutionary benefit of disabilities” resulted in one random, small news article after another, hardly convincing evidence. Another, non-disability related one might be homosexuality; indeed, that was one of the choices of text that Google suggested for me when I was typing “evolutionary benefit of”. It seems to be fairly accepted that homosexuality is not a choice, but then this raises the question: what is its evolutionary benefit? And what about, er, this kind of stuff? All right, that’s enough of this thinking for today.
At the end of the first CS 267 (Applications of Parallel Computing) lecture, I was looking forward to the rest of the class.
Well, after three more lectures, I’m probably done attending them for the semester.
No, don’t worry, I’m still taking the class, but I negotiated an unusual accommodation with Berkeley’s Disabled Students’ Program (DSP). All CS 267 lectures are recorded and available on YouTube to accommodate the large number of students who take it as Berkeley students and as non-Berkeley students. The course is also offered almost every year, so students can watch lectures and study the slides from previous iterations of the class.
So what did I suggest to DSP? I told them that it was probably best for me not to attend classes, but to watch the lectures on YouTube, so long as DSP could caption those videos.
Why did I do this? Because CS 267 has three factors that are essentially the death-knell for my sign language interpreting accommodations:
The material is highly technical.
The lecturer (Professor Jim Demmel) goes through the material quickly.
I am not familiar with the foundational topics of this course.
The last one was the real deal-breaker for me. Even in classes that completely stressed me out due to the pace of the lectures and lack of suitable accommodations (CS 288 anyone?), I still had the foundational math and machine learning background to help me get through the readings.
But for a class about lower-level computing details? I have to check Wikipedia and Stack Overflow for even the most basic topics, and I could not understand what was being discussed in lecture.
Thus, I will watch lectures on YouTube, with captions. Unfortunately, DSP said that they required at least a 72-hour turnaround time to get the captions ready, and I’m also not sure who will make them. I think it would be hard for the typical captioner to caption this material. I suggested that using YouTube’s auto-captions could be a useful starting point to build a transcript, but I don’t know how feasible it is to do this.
I suppose I could fight and demand a shorter turnaround time, but honestly, YouTube’s auto-captions are remarkably helpful with these videos, since I can usually fill in for the captions’ mistakes. Also, the limiting factor in my progress in this course isn’t my understanding of the lecture material – it’s my C programming ability. Finally, I have other issues to worry about, and I’d rather not get into tense negotiations with DSP. For instance, I still regularly feel resentment at the EECS department for what I perceive as their failure to help me get acclimated to a research group. I am only now starting to do research where I am not the lead and can work with more experienced researchers, but it took so long to get this and I’m still wondering about how anyone manages to get research done. My research, and overall mood, has been a little better this semester than last semester, but not by much. I don’t want to end up feeling the same way about DSP.
Hopefully this new accommodation system for CS 267 will go well.
I have now completed the first class sessions for all the courses I’m taking this semester.
And I’m relieved.
I’ve always found the first session to be the most awkward of all class sessions. The reason is that, due to my class accommodations, there are typically two sign language interpreters (or sometimes in the past, captioners/CART-providers) who show up to the class with me. They sit near the lecturer, so they’re impossible to miss.
Consequently, when it’s the first day of class, I sometimes get paranoid and wonder if students are constantly thinking about the extra people in the room. Or, worse, what if they’re repeatedly looking at me? After all, other students might be curious about who on earth might actually need such accommodations. When I think about this, my face feels a bit hotter and I sometimes wish I could hide and blend in like a “normal” student, for once.
That’s not to say I never want people to think about me. For instance, if I knew students were thinking something similar to: wow, that guy over there who needs sign language accommodations must be reasonably good at this material or possess the ability to work extremely hard, given his inherent disadvantages, well then perhaps I shouldn’t feel so awkward.
Of course, the point is that I don’t know what other students think of me, so I default to a more pessimistic view.
The worst part about these first sessions is when the interpreting integration does not go seamlessly. When this happens, it’s usually because someone arrived late to class. One of the most awkward first class sessions for me occurred back in my sophomore year of college. I was taking an intermediate microeconomics class with about 50 students in it. The school administration gave my interpreters the wrong room number, and I had failed to notify them after only recently finding out myself.
This meant that the interpreters showed up five minutes late to the first class, after everyone got seated. They caused a brief interruption, with one interpreter telling me what happened, and the other one introducing themselves to the professor.
Yes, that was pretty awkward. My face was a little red and I kept my eyes firmly focused on the board, hoping that the other 50 students wouldn’t look at me for more than a few seconds.
Don’t misunderstand what I’m saying – there are times when I really like the attention. For instance, as I’ve stated a few times in this blog, I enjoy giving talks (e.g., project presentations), so I like the attention in those cases.
I just don’t like being highly visible when it’s the first day and a bunch of students who don’t know me have to suddenly get used to the interpreting services in the class.
In addition to bearing the initial awkwardness over the accommodations, I have a few other first-day concerns. One is that I know I have to arrive early to classes to make sure I can get a seat in the front row of the class, preferably at one of its “ends” since that results in the optimal positioning for me and the sign language interpreters (and probably for the other students; I don’t want to know how annoyed they’d be if the interpreters sat in the center of the room).
Due to the enrollment surge in graduate-level EECS courses, if I don’t manage to quickly secure one of those coveted front-row seats, then I probably have to sit or stand near the front corner. For me, it’s better to stand near the front than sit in the back, but fortunately I’ve never had to weigh that tradeoff. In all my classes this semester – in fact, in every class I’ve had in recent memory – I’ve always been able to secure a front-row seat, but it’s still a concern for me.
Fortunately, with the first class sessions behind me, things should improve. From past experience, after about four weeks, everyone seems to get used to the interpreters, and a few wonderful students and professors start socializing with them (and me!).
Furthermore, after the first class, it becomes clearer to me and the interpreters how to best position ourselves for maximum benefit. I’ve had to suggest changing our seats a few times.
All right, I guess what I really want to say is that I’m looking forward to my next few classes.
Two and a half years ago, I wrote a post about programming in Python. One of my tips was to use the Python shell, so that one can quickly test simple commands before integrating them in a more complicated project.
Fast forward until now, and my Python habits have changed substantially. One notable change I have made is to use IPython instead of the Python shell. For my usage purposes, the IPython shell has been a strictly superior version of the standard one due to the following:
- It includes TAB completion for functions. For instance, suppose I’m importing the numpy library, and I want to create an array variable, which means I need the array function. I start the IPython shell (by typing ipython on the command line), import the numpy library, and when I press the TAB key after a = np.arr, I get a list of the matching names.
IPython is smart enough to tell me which methods I might be interested in using! It’s a really nice feature, and I’ve found that it also works when one tries to autocomplete function parameters. In the standard Python shell, typing TAB just means … creating extra TABs.
- It makes it easier to fix for loops, which is handy because it’s really easy to make a mistake with loops. Consider a trivial two-line loop whose body is print "hi". In IPython, to fix the loop, I just need to press the UP key and it will load both lines of the for loop. In the standard shell, the UP key would only return print "hi", forcing the user to essentially retype the loop.
- It remembers commands from previous sessions, so I can exit an IPython session, do other stuff, then restart IPython, press the UP key, and it will give me the commands I used in my last session.
These three are the extra IPython features that have been most useful for my work.
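To make the TAB completion idea concrete, here is a rough sketch of the kind of prefix lookup a shell could perform. It is only an illustration and not how IPython actually implements completion. The helper name completions is made up for this post, and I use the standard math module so the snippet runs without installing anything.

```python
import math


def completions(obj, prefix):
    """Return the attribute names of obj that start with prefix, sorted."""
    return sorted(name for name in dir(obj) if name.startswith(prefix))


# Roughly what pressing TAB after "math.sq" has to figure out.
print(completions(math, "sq"))
```

Calling the same helper on the numpy module with the prefix "arr" would list names such as array, which is the kind of suggestion the shell offers after np.arr.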
I frequently use Python for work because it is a simple language that has lots of robust math, machine learning, and data analysis libraries. My favorite Python library is matplotlib, which is used for forming high-quality plots.
A few months ago, my workflow for using matplotlib was to write a script that first gets the data into a matplotlib plot, and then saves it (using the savefig(...) function). When I need to make lots of figures, however, it gets cumbersome to manage them, and I often have to keep multiple images open so I can spot-check their changes when I re-run my script (e.g., if I modified the font size of the text).
Fortunately, I discovered Jupyter Notebooks. These are browser-based platforms that make managing matplotlib-based images far easier by keeping information unified in one screen.
To start a notebook session, I type in ipython notebook on the command line, which opens up a web browser (for me, it’s Firefox). I then click New -> Python 2 to start the session. For a basic plot, I can start by importing the library: import matplotlib.pyplot as plt, but then, crucially, I use the %matplotlib inline command. The reason for using that is so that when I write code to plot, and then execute it with a simple SHIFT-ENTER, the image will appear directly under that code cell. Here’s a simple example:
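A minimal sketch of such a cell might look like the following. The data is made up for illustration. The %matplotlib inline line is written as a comment because it is a notebook magic command, not ordinary Python; inside a notebook it would be the first line of the cell, uncommented.

```python
# %matplotlib inline   <- notebook-only magic; uncomment inside Jupyter
import matplotlib.pyplot as plt

# Toy data, invented for this example.
xs = list(range(10))
ys = [x ** 2 for x in xs]

# Create one figure with one set of axes and draw a single curve on it.
fig, ax = plt.subplots()
line, = ax.plot(xs, ys)
```

After SHIFT-ENTER, the notebook should render the curve right below the cell, with no call to savefig needed.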
This is nice, but what if I want to change some plot setting? If these images are going to be in an academic paper, they had better have labels and legends, among other things. With these notebooks, one can modify the text in a cell and regenerate the image; here’s an example with some common commands I use for my plots:
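The exact commands will vary from project to project, so treat the following as a sketch with placeholder data, titles, and font sizes and not as a fixed recipe. It shows the usual suspects: a title, axis labels, and a legend.

```python
# %matplotlib inline   <- notebook-only magic; uncomment inside Jupyter
import matplotlib.pyplot as plt

# Toy data, invented for this example.
xs = [0, 1, 2, 3, 4]

fig, ax = plt.subplots(figsize=(6, 4))

# The label argument is what the legend displays for each curve.
ax.plot(xs, [x for x in xs], label="linear")
ax.plot(xs, [x ** 2 for x in xs], label="quadratic")

ax.set_title("Two toy curves", fontsize=16)
ax.set_xlabel("x", fontsize=14)
ax.set_ylabel("f(x)", fontsize=14)
ax.legend(loc="upper left")

# Trim the margins so the labels are not cut off.
fig.tight_layout()
```

Changing a font size or a label and pressing SHIFT-ENTER again redraws the figure in place, which is what makes spot-checking so much faster than re-running a script and reopening the saved image.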
This example doesn’t quite show the benefit enough, but once projects get more complicated, notebooks are a valuable tool to keep data organized. Moreover, one can save a notebook session so that the next time it gets opened again, its plots remain visible on the webpage.
For those of you who use Python, I encourage you to check out IPython and Jupyter. They add on to what is already an awesome general-purpose programming language.
This link on Berkeley “By the Numbers” states that 73 percent of undergraduate classes have fewer than 30 students.
That statistic is (painfully) amusing for me to think about, because I’ve only taken graduate courses here, and none has had fewer than 30 students. In my class reviews, I frequently discuss enrollment, so let’s recap:
CS 280, Computer Vision was overenrolled and had people sitting on the floor in Soda 306. The course staff had to force undergraduates to drop the course.
CS 281A, Statistical Learning Theory had one of the largest (if not the largest) rooms in Cory Hall, and we still had people sitting on the floor during the first few lectures. This is despite the fact that CS 281A had also been offered the semester before I took it. Most graduate courses never get offered in consecutive semesters.
CS 287, Advanced Robotics. This is the only class where I can get a precise picture of enrollment in previous years, since the CS 287 course websites list the final project presentation schedule and I can count the students. (The Fall 2015 edition is on Piazza, not the official website.) The Fall 2009, 2011, 2012, 2013, and 2015 classes had the following respective number of students give project presentations: 19, 36, 15, 48, and 58.
CS 288, Natural Language Processing was overenrolled at the start; the professor said in his introductory email that “Since there are 80+ of you interested in what is normally a 20-person class, I wanted to be clear about how we’re planning to handle enrollment […].” Even with students eventually dropping, I am almost positive we had well over 30 students, possibly over 40 remaining at the end of the semester.
CS 294: Deep Reinforcement Learning was overenrolled and the staff moved the room and offered two lecture times. In theory, deep reinforcement learning is just one “small sub-research area” of Artificial Intelligence, but in reality, it’s probably the most popular of those areas.
EE 227BT: Convex Optimization was also quite crowded. I don’t know if enrollment was that much greater than in previous years, but I don’t think having 50-60 students should be the norm in a graduate-level course.
It should be clear that this is due to the growing popularity of computer science as a major and a graduate degree (this page provides some hard statistics on Berkeley’s CS enrollment). The result is that Berkeley and similar schools have had to drastically expand the size of faculty and lecturers, but I worry about what will happen long-term if enrollment abruptly declines, say in five years. I wasn’t old enough to understand the dot-com bust, but I think I may need to go and read some of the literature on that era to have a better idea if history is repeating itself.
As an alumnus of Williams College, I regularly get emails from my class officers requesting donations to the college. These emails try to convince us to give money by including variations of: “Don’t forget the awesome memories you had at Williams. Please donate to support the experience of current students!” On the Williams website, it’s not hard to find testimonials of students saying that they have made a lot of friends and love the college. Many students have also told me this directly.
I wish I could agree.
It has now been a year and a half since I graduated from Williams. During our commencement, since the class size was small enough, the graduating students lined up to walk across the stage to get their diplomas. For some students, as their name was called, the audience roared with the sounds of their friends cheering and hollering, and Dean Sarah Bolton would have to smile and wait for the applause to die down before calling the next student.
When she called me, a blanket fell over the crowd. It was uncomfortably quiet. As I approached President Adam Falk to receive my diploma, I heard a faint scream out in the audience. I didn’t look there; I just took the diploma and went directly back to my seat, feeling a little sullen.
Later, my Mom asked me if I had heard her as I was walking on the stage.
To be fair, graduation wasn’t a complete embarrassment, even though it sometimes felt that way. Every now and then, I was able to find and talk to a few graduating students. I waved a bit, asked students about their post-graduation plans, and engaged in other polite conversations. I even managed to get in a few photos.
But deep down, I knew that I had failed on one of my two major goals before entering college. The first goal, which I achieved, was to do well academically and get into a good graduate school in the sciences. I did that, and while I never thought my field would be computer science, somehow I made it. It doesn’t hurt that computer science is a pretty hot field now.
My second goal was to make close friends.
Not acquaintances. Not one-and-done homework buddies. Not people with whom our communication would derive primarily from exchanging Facebook posts.
Real, close friends, people who I could count on for the highest-priority social events, people who I could comfortably hang out with outside of the college realm, people who I could really trust.
I was concerned about making friends before entering Williams, since I had been unable to do that in high school. (Most people from the high school who I stay in touch with nowadays are those who I would have known even if I had not gone to my high school.) To be fair, I was reasonably friendly with students from the Deaf and Hard of Hearing classroom in my high school, but my attempts to extend this to hearing students did not succeed. Given that I was the only deaf student at Williams, I was concerned.
Despite a lack of social skills, my first semester at Williams actually exceeded my expectations. For one of the few times in my life, I was surrounded by brilliant, talented students my age who were also extremely eager to get to know each other. During the first few weeks, I couldn’t believe how many times people would come up to me, unprompted, to say hello, relieving me of the burden to start an awkward conversation. My goodness, where was this my entire life?
Unfortunately, as the months, semesters, and years went on at Williams, I gradually realized that I was missing out on close friendships. I would occasionally find homework collaborators, gym partners, and irregular eating groups.
But when it came to the “real” social events, I was out.
Like in most colleges, Friday and Saturday nights are prime social hours at Williams, the times when students stick with their closest friends to go out to eat, have a party, or to just hang out (hopefully doing nothing illegal, but never mind). I usually spent Friday and Saturday nights in the computer science laboratory or in my dorm room. It’s not that I was turning down party invitations – I didn’t get them.
When I wandered around campus during these times, I regularly walked by large groups of hollering students, some of them drunk. I’m not going to lie – I really, really wished I could have been part of some of those groups, enjoying myself in the company of friends (but without the drunk part). I dreamed about this, replaying hypothetical social situations in my mind and pretending that I was the popular person in the center of the crowd, leading the group to their destination.
Unfortunately, the reality was that during the few times I was lucky enough to be with a group of students late at night, I generally did not enjoy those experiences. The reason is obvious. When the other students talked, I was unable to understand what they were saying. If I were really popular, it might be possible to have students who act as personal translators, but that was not the case.
It didn’t help that I had what I would call a “friendship ranking” problem. I could form a ranking of the top ten Williams students with whom I was friendliest. But I don’t think any of them would have me at a comparable rank on their lists; I would probably be around ten spots lower. Thus, during the prime social hours, those high-ranking students would socialize with people on top of their hypothetical friendship list. It’s what a rational human being would do. And, admittedly, I didn’t have much courage to ask people to do things together. I worried that I wouldn’t be able to understand what they said, or that I would inconvenience them.
During my second year at Williams, I had a series of stressful and unpleasant experiences in groups and parties. I consistently ran into the problem of being unable to hear what students said in group situations. Starting in my junior year, I resolved to never attend parties again. I was sick of showing up to these events alone, watching people roar and laugh at something mysterious, and then walking back to my dorm room by myself.
For a while, simply ignoring these events worked. I sometimes had nagging thoughts that I really was missing out on lots of fun and friendship, but for a while, I could hold thoughts about isolation and friendship at bay.
The Fall 2013 semester was when my mental barrier broke. My isolation truly began to hit home, and to make matters worse, it came during an incredibly stressful time of my life, when I had to write graduate school applications and work on research. During the start of that semester, my isolation consumed me. I constantly thought about it when I was completing homework, sitting in class lectures, eating by myself, and doing other activities. I was unable to focus in class and had trouble sleeping. I soon had enough of it, and left the campus for a weekend to recharge at home and to talk with an external counselor.
During the winter break, since my family lives within driving distance from the college, I remained at home, with the occasional foray to campus if I had a thesis meeting. I soon faced the reality that, while I was home, I didn’t get texts or messages from other students, asking where I was. I felt that students didn’t care about me.
I was disconnected from them.
Fortunately, the Fall 2013 (and winter 2014) debacle had a not-disastrous ending. Being away from campus helped me mentally recover (but didn’t help me make close friends). My grades were fine, and I caught up on research in the following semester, Spring 2014. I also felt better once I had gotten into more graduate schools than I expected, since I could look forward to starting a new social life at my next school, forever thinking about how to upgrade from “acquaintance” to “close friend.”
Despite feeling better in the spring, I still skipped all the major senior social dances, parties, and events. No one asked me to go, and I did not know anyone who I could confidently ask to go with.
Ultimately, I have mixed feelings about my Williams experience: generally positive for academics, generally negative for social. I have not donated any money and don’t plan to donate, though I might change my mind later. After graduating, I knew for sure that I wanted the Fall 2013 semester to remain the worst semester of my life. I had no desire to relive my constant concerns over isolation.
But then, the Fall 2015 semester happened.
I’m going to refrain from a final statement as to which of these two semesters was worse. Hopefully, after some more time passes, I can relax and judge the Fall 2015 semester with a clearer mind, like how American presidents are often evaluated more favorably far beyond their presidency, as compared to immediately after their last term.
The Fall 2015 semester, however, currently holds the edge in the title of “worst semester ever”. The culprit, if you haven’t figured it out already: isolation.
Almost all of my negative experiences, almost all of my sources of stress and depression, just like at Williams, can be traced back to that one single, simple concept.
My “isolation thoughts” reappeared in the summer, an ominous sign of things to come. During that summer, I was alone in my lab room, which has six desks but (at that time) had only three students, including me, and the other two had internships at Google and Microsoft. I can remember three times when I was not, strictly speaking, alone there: when one of the two students took a break from his internship to give me a much-needed “hello” while we had lunch (that day was great), when a random Master’s student came to install his computers in the room (but he never showed up again and I saw someone else move his computers later), and when two students from another research group installed a research computer in the lab (but their real office is in a different building).
Aside from those three cases, I can’t think of another time when I spoke with anyone else near my desk that summer. It should say a lot that I vividly remember these minor interactions (and what we talked about), because deep, memorable interactions are hard to get.
As I mentioned in another post with the prefix “Thoughts on Isolation”, the isolation I was experiencing in the summer gradually consumed me and hindered my ability to do work and to study. During the weeks before, during, and after my prelims (i.e., late August), I went through several days that I would call “lost days.” Here’s the definition: a “lost day” is one when I show up to my office at the usual time, stay there for eight to ten hours, but do not make any progress at all on work, because my mind is consumed with thoughts on isolation.
Here’s an example. Suppose I show up in the morning with the goal of understanding a dense, technical paper that might help me with my research. I read a paragraph, but then have a thought appear in my head: that there was a recently-published paper co-authored by three Berkeley graduate students that was all the rage in research meetings. Then I get disappointed that somehow these students got together and were able to – presumably – bounce ideas off of each other and collaborate in creating a high-quality paper. I think: Why can’t I have that experience? A minute later, I shake this thought out of my head and realize that I have to read the paper in front of me. Since I am distracted, I have to start back at the paragraph I just “read.” Unfortunately, after re-reading that paragraph, another thought explodes in my mind. This time it’s about something different: perhaps I remember seeing three other graduate students eating lunch together. I think about this, frustrated that I don’t have a consistent eating partner, and then snap out of it again to try and get back to reading. Of course, I have to start from the beginning of that same paragraph, and so on …
These feedback loops were devastating, robbing me of any hope of making progress during those lost days. I tried desperately to escape the loop: calling my parents, walking around campus, going to cafes, lying down on the couch in the lab room, you name it. But none of these were able to completely get rid of the feedback loop.
If only I could make it to the prelims, I thought, then things would get better. Passing the prelims would give me confidence that I needed to regain my research productivity. The start of the semester meant that there would be more people around. Things would go better.
So much for that.
Despite an impressive performance on the prelims, the Fall 2015 semester was a disaster. If anything, I felt more isolated compared to how I felt in the summer. I was bombarded with signs that students were less isolated than me. I saw students in the same research group stick with each other, working together or hanging out. The fall also brought a new wave of accepted research papers, many of them involving groups of two or more graduate students and postdocs. It was hard to avoid knowing about these papers, as the information is readily available. Sometimes these papers are on graduate student homepages, but I try not to look at those anymore.
Looking at these groups of students, either together socially or together in a publication, made me feel frustrated. I longed to be part of those groups. I wanted to break out of my cycle of isolation. I wanted to feel happy looking at other people, not disappointed.
My mood did not recover from the summer. I would feel upset while sitting in class lectures, knowing that I was different from the other students. I repeatedly got angry at myself during (and after) lectures when I was unable to follow the sign language well enough to sufficiently understand what lecturers were saying. I tried to reassure myself, knowing that I would spend nights and weekends reading webpages and textbooks to catch up on the lecture material, but somehow that didn’t make me feel better.
There’s something else that happened this semester. Something I’ve been trying not to think about lately, without much success.
I would feel isolated and experience a slight twinge of resentment, whenever I heard, read, or thought about “diversity in computer science.” I kept thinking that “diversity” in the context of computer science means getting more women and racial minorities involved (well, not all racial minorities …).
When I search online about “being black in computer science” or other similar queries, I see articles such as this recent one from Stanford. One of the sections in that article says: A feeling of isolation, and it describes isolation from being a racial minority.
Wait, let’s read that again:
A feeling of isolation.
Oh, wow. You know, that might just describe how I feel on a daily basis.
I kept thinking throughout the semester that, whenever the topic of diversity in computer science comes up, it’s assumed that Caucasian and Asian males, such as myself, have few issues getting along with others and feeling included.
That is probably true for most of us, but I can state from personal experience that all the attention towards making women and minorities feel more included in computer science makes me a little frustrated. OK, sometimes more than “a little.”
To be clear, I’m not saying that I don’t have advantages from being a Caucasian and Asian male. I have never been racially insulted, or sexually assaulted. I have been driving for seven years and have never been pulled over by the police – ever. If I had a different body type, those aspects about my life might be different.
But on the other hand, suppose I were black and hearing. Then, wouldn’t it be possible for me to sit through a lecture and finally piece together a few consecutive sentences from the lecturer? Wouldn’t it be possible for me to follow the conversation in a rapidly-scheduled research meeting with five people?
Wouldn’t it be possible for me to enjoy being in a group?
I face challenges that are different from those of women and minorities, some of which will lead to similar conclusions (i.e., isolation). Unfortunately, I don’t feel like I have an outlet, some kind of real support group of students who might help me. And people won’t line up to hear my opinion.
Imagine me thinking about this over and over again. That was Fall 2015 in a nutshell.
I’m not saying that my first year at Berkeley was that great – it wasn’t – but I never regularly thought about how much I was detesting it here.
Eventually, as the semester progressed with more thoughts on isolation and a few more “lost days,” I finally tried to tell people explicitly that I needed help to combat isolation. That this semester was just taking too much of a toll on me. Earlier, I had told others that my graduate experience wasn’t that great, but I now had to downgrade it from “so-so” to “awful” to make things clear.
I don’t want to place the blame on anyone in particular. I don’t think there is anyone to blame, except the “system” as a whole. I believe this because one thing that hurt me was failing to make it obvious when I first arrived in Berkeley that (a) I was deaf, and (b) I needed help finding real collaborators.
While I do feel like things can move at such a glacial pace, at least there are people here trying to help me out. I’m extremely grateful to the ones who have not completely disregarded me, and have given me the opportunity to – as of today – have much more collaboration than I have had in my life. A new era begins now. I can’t waste this opportunity.
So will my story have a happy ending? (Sigh) I don’t know.
By now, it should be clear that 2015 was not the greatest year for me. It started off somewhat, kind of, reasonably well, but fell off a deep cliff during the summer and remained buried under a Mount Everest-sized pile of stress during the Fall 2015 semester.
I really hope 2016 will go much better.
I’ll keep this conclusion short. To everyone, my goodness, Happy New Year.
As the year 2015 wraps up, I’ve been reviewing my New Year’s Resolution document. Yes, I do keep one; it’s on my laptop’s home screen so I see it every time I start my computer. No, I unfortunately did not manage to accomplish anything remotely close to my original goals.
I did, however, read more books this year than I did in previous years. I was a committed gamer back in high school and college and I’m trying to transition from playing games to reading books in my free time (in addition to blogging, of course).
In this post, I would like to briefly share some thoughts on three of my favorite books I read this year: Guns, Germs, and Steel, The Ideas that Conquered the World, and (yes, sorry) The God Delusion.
Guns, Germs, and Steel
Guns, Germs, and Steel: The Fates of Human Societies, by Jared Diamond, is a 1998 Pulitzer Prize-winning (General Nonfiction) book about, essentially, how human societies came to be the way they are today. It aims to answer the question: Why did Eurasians conquer, displace, or decimate Native Americans, Australians, and Africans, instead of the reverse?
The white supremacists, of course, would say it’s because Caucasians are superior to other races, but Diamond completely eviscerates that kind of thinking by presenting strong geographic and environmental factors that led to Eurasia’s early dominance. Upon the age of exploration, it was Europe which contained the most technologically advanced and most powerful countries in the world. (Interestingly enough, this was not always the case in the world; Australia and China had their turns as being the most advanced countries in the world.) That European explorers had guns was not the main reason why they conquered the Americas, though: it was because they were immune to diseases such as smallpox that decimated the native populations.
I learned a lot from this book. Seriously, a lot. The book was full of seemingly unimportant factors that turned out to have a major impact on the world today, such as the north-south shape of the Americas versus the east-west nature of Eurasia. While I was reading the book, I kept repeating to myself: wow, that argument should have been obvious in hindsight, an indication that the book was effectively supporting its hypotheses. I felt a little uncomfortable when Diamond had to add several disclaimers in the book that it was not going to be “a racist treatise” but unfortunately that text is probably still necessary in today’s world.
A negative effect of reading this book was that, since it deals with the growth of human civilizations, it made me want to play some Civilization IV, but never mind. This was a great book.
The Ideas that Conquered the World
The Ideas That Conquered The World: Peace, Democracy, and Free Markets in the Twenty-first Century, by Michael Mandelbaum is a 2002 book that reviews the state of Western values at the start of the 21st century. If one compares life today to what it was like during the Cold War and earlier, some of the most remarkable trends are that countries heavily prefer peace as the basis of foreign policy, democracy as the basis of political life, and free markets as the basis of economic growth. Mandelbaum explains how these trends occurred by providing an overview of how countries previously conducted internal and foreign affairs from 1800 to the present. He particularly analyzes the impact of World War I, World War II, and the Cold War on liberal values.
There are many interesting themes repeated in this book. One is that Germany and Japan serve as the ultimate examples of how previously backward countries can catch up to the world leaders by adopting liberal policies. Another is that there are three “dangerous” regions in the world that could threaten peace, democracy, and free markets: the Middle East, Russia, and China, since those countries wield considerable power but have not completely adopted liberal principles. (In 2015, with all the terrorism, oil, and migrant crises in the Middle East, along with America’s diplomatic tensions with Russia and China, I can say that Mandelbaum’s assessment was really spot on!) A third theme is that much of the world has actually become less peaceful after the Cold War, a consequence of how the core countries now have fewer incentives to protect those countries on the periphery.
Of the three books here, this one is probably the least well-known, but I still tremendously enjoyed reading it. I now have a better understanding about why there is so much debate over government size in American politics. The role of the government in a free market society should be to let the market function normally, except that it should provide a social safety net and other services to protect against the worst effects of the market. How much and to what extent those services should be provided is at the heart of the liberal versus conservative debate. As a side note: I find it really interesting how “liberal” is related to the free market, yet the stereotype in today’s politics is that conservatives, not “liberals” as in “Democrats”, are the biggest free market supporters. That’s vastly oversimplifying, but it’s interesting how this terminology came to be.
Oh, I should mention that this book also made me want to play Civilization IV. Perhaps I should stop reading foreign policy books? That brings me to the third book …
The God Delusion
The God Delusion, by Richard Dawkins, is a 2006 book arguing that it is exceedingly unlikely for there to be a God, and that there are many inconsistencies, problems, and harmful effects of religion. This is easily the most controversial of the three books I’ve listed here, for obvious reasons; a reviewer said: “Bible-thumpers doubtless will declare they’ve found their Satan incarnate”.
Dawkins is a well-known evolutionary biologist but is even better known for being the world’s most prominent atheist. In The God Delusion, Dawkins presents a spectrum of seven different levels of beliefs in God, starting from: (1) Strong theist. 100 per cent probability of God. In the words of C.G. Jung, ‘I do not believe, I know’ to (7) Strong atheist. ‘I know there is no God, with the same conviction as Jung “knows” there is one.’
Both Dawkins and I classify ourselves as “6” on his scale: Very low probability, but short of zero. De facto atheist. ‘I cannot know for certain but I think God is very improbable, and I live my life on the assumption that he is not there’. I also agree with him that, due to the nature of how atheists think, it would be difficult to find people who honestly identify as falling in category 7, despite how it’s the polar opposite of “1” in his scale, which is very populated.
This book goes over the common arguments that people claim for the existence of God, with Dawkins systematically pointing out numerous fallacies. He also argues that much of what people claim about God (e.g., “how can anyone but God produce all these species today?”) can really be attributed to a one-time event, plus the cumulative nature of evolution. In addition, Dawkins discusses the many perils of religion, about how it leads to war, terrorism, discrimination, and other destructive practices. For an obvious example, look at how many Catholics have a negative and inflexible view of homosexuals and homosexuality. Or for something even worse, look at ISIS.
The God Delusion ended up mostly reinforcing what I had already known, and expresses arguments in a cleaner way than I could have ever managed. This brings up the question: why did I already identify as being in category 6 on Dawkins’ scale? The reason is simple: I have never personally experienced any event in my life that would remotely indicate the presence of God. If the day were to come when I do see a God, then … I’ll start believing in God, with the defense that, earlier, I was simply thinking critically and making a conclusion based on sound evidence. After all, I’m a “6”, not a “7”.
Dawkins, thank you for writing this book.
In computer science graduate-level courses at Berkeley, it is typical to have final projects instead of final exams. There are two ways in which these projects are disseminated among the students:
Class Presentations. These are when students prepare a five to ten minute talk to the class, using slides and other demos to state the project’s main accomplishments. Due to explosions in class enrollment (see my class reviews here for examples), time limits are strictly enforced, so presentations must be precisely timed and polished.
Poster Sessions. These are when students bring a poster describing their work. Usually, students create posters by stuffing lots of images and text into a PowerPoint slide (or other software). Then they print it using their lab’s poster printer.
I’ve experienced both scenarios at Berkeley, and based on those I would strongly state the following to instructors: class presentations are better than poster sessions, and should be the method of choice for dissemination of final projects.
First, a class presentation means students practice a useful skill, one that they will likely need for their future careers. This is especially true for academic careers, and students taking graduate-level courses are far more likely to want academic careers than the average undergrad. For me, presentations are also a way that I can channel my humor, which isn’t immediately apparent to other students. A second, less important reason, is that in an age of exploding enrollment in graduate courses, it’s nice to be able to finally learn people’s names when they give class presentations.
One can, of course, learn names and project accomplishments in poster sessions, but this requires more effort and is challenging for people like me. I have lots of difficulty navigating my way through loud, noisy poster sessions filled with accents. I either resort to reading people’s posters (and not understanding much of it anyway due to time constraints) or going through the awkwardness of having a sign language interpreter with me (and having that interpreter struggling through accents and technical terms).
Poster sessions have other downsides that apply broadly, and not just to deaf students. For instance, poster sessions allow students to hide. What happens if students don’t manage to do much for their final projects? As I’ve seen happen in my classes, these students go to the corner of the room to avoid the spotlight. Presentations avoid this issue, unless students are willing to go as far as to even skip their presentation time. Some students who are nervous about public speaking might also want to hide. To most of them, I would respond: good luck convincing your future bosses to have you not do any presenting.
If class presentations force students to produce something that is worth presenting and force them to encounter their fears, then that’s probably sufficient reason alone to use them!
There are other downsides to having poster sessions. They cost more, creating a chasm between students who have access to fancy poster printers and those who don’t; the latter may have to resort to printing out ten pages of work and pasting them together in a poster. Furthermore, the posters that get printed are unlikely to be used again in the exact same form. True, many conferences have poster sessions due to scalability issues, but class projects are not generally up to par with research projects, so students would have to re-print posters anyway. And that’s assuming that students are using class projects as the basis for future research, which isn’t always the case.
Class presentations are also superior to poster sessions in that they require less physical room. The presentations can be delivered in the same lecture room, while poster sessions force the course staff to go through the trouble of finding and reserving a large room (or hallway, as is the case for Berkeley).
Furthermore, the one “benefit” of poster sessions, scalability, does not stand up to a rigorous analysis. (If there are other benefits, please let me know because I can’t think of any.)
First, if the class size is so large that it approaches the enrollment of a popular academic conference, then would the course staff really have time to read the final reports? Remember, neither presentations nor poster sessions enable people to fully understand a project; for this, one has to read papers.
Second, with five minutes per presentation, the process goes by quickly, and it is also easier for the course staff to track progress. Also, with a large class, it is likely that students would be encouraged to form groups, drastically reducing the quantity of presentations. If there are too many presentations for one class, the course staff should divide the class into groups.
Finally, scheduling presentations is not generally a problem even with many groups. Here’s a simple procedure: have a random draw to see who goes next. If the class requires a fixed schedule, then busy instructors should have their TAs form the order of presentations.
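The random draw described above can be sketched in a few lines. This is only an illustration: the function name `draw_presentation_order`, the optional `seed` argument, and the group names are all made up for the example, not taken from any course.

```python
import random

def draw_presentation_order(groups, seed=None):
    """Return a randomly shuffled copy of the list of groups.

    The seed is optional and only here so a draw can be reproduced;
    leave it as None for a fresh draw each time.
    """
    rng = random.Random(seed)
    order = list(groups)   # copy, so the original roster is untouched
    rng.shuffle(order)     # uniform random permutation
    return order

# Hypothetical roster of project groups.
groups = ["Group A", "Group B", "Group C", "Group D"]
order = draw_presentation_order(groups, seed=0)
for slot, group in enumerate(order, start=1):
    print(f"Slot {slot}: {group}")
```

Drawing the whole order up front, rather than one name at a time, gives a fixed schedule for classes that need one while keeping the draw fair.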
Unfortunately, the classes I’m taking next semester have historically used poster sessions rather than verbal presentations, but perhaps I could convince them to change their minds?
The third class I took this semester was Convex Optimization (EE 227BT), which was also my first time wading into electrical engineering. There are three convex optimization courses at Berkeley: EE 227A, EE 227B, and EE 227C. (Note: I say 227BT in this title because the course had a “T” for “Temporary,” but that should go away soon.) I did not take the first course, EE 227A, and I think that may have been a reason for my struggles in this class.
To do well in EE 227B, I think one needs to be highly skilled in the following two areas: linear algebra and problem solving. If a student lacks one or both of these skills, he or she is in serious trouble. For a linear algebra concept, consider this problem: $\max_{\|x\|_2 = 1} x^T A x$ for symmetric $A$. We encountered this at the start of the semester and would see it over and over again. The professor, Laurent El Ghaoui, said: “If you didn’t immediately know that the answer to this was the maximum eigenvalue of $A$, or $\lambda_{\max}(A)$, then run away to EE 227A. This is all linear algebra.” I did know that, in fact, but the class material was nonetheless very difficult for me to understand.
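The linear algebra fact in question (presumably that maximizing the quadratic form x^T A x over unit-norm x, for symmetric A, gives the largest eigenvalue of A) is easy to sanity-check numerically. Here is a minimal sketch in plain Python for the 2x2 case; the matrix and the helper names are my own choices for illustration and have nothing to do with the course materials.

```python
import math

def quadratic_form(A, x):
    """Return x^T A x for a 2x2 matrix A and a length-2 vector x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

def max_eigenvalue_2x2(A):
    """Largest eigenvalue of a symmetric 2x2 matrix, via the closed form."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    return mean + radius

def max_over_unit_circle(A, steps=10000):
    """Brute-force the max of x^T A x over unit vectors x = (cos t, sin t)."""
    best = -math.inf
    for k in range(steps):
        # x and -x give the same value, so angles in [0, pi) suffice.
        t = math.pi * k / steps
        best = max(best, quadratic_form(A, (math.cos(t), math.sin(t))))
    return best

# Hypothetical example matrix; its eigenvalues are 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
lam_max = max_eigenvalue_2x2(A)
brute = max_over_unit_circle(A)
print(lam_max, brute)
```

For this particular matrix the quadratic form reduces to 2 + sin(2t), so the brute-force search peaks at t = pi/4, where it agrees with the largest eigenvalue.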
We had five problem sets, and I think they were among the hardest ones I’ve ever had, and also more challenging than those from CS 281A. After spending 30 to 40 hours on the first few homeworks, I realized I needed to seriously start reaching out to other students to get more than two-thirds of the homework done correctly, and I did do that this semester.
Each problem set contained three to five questions, each of which had some number of sub-problems. Their difficulty varied considerably, with some parts following directly from the definition of Cauchy-Schwarz, (not Cauchy-Schwartz … I don’t know why people keep misspelling that), and others requiring some ridiculously complicated insights. The hardest one was to prove Theorem 4 and Corollary 3 from Laurent’s paper Sparse Learning via Boolean Relaxations. Yes, we had to do that, and no, we were not given this paper reference and had to start some of that from scratch. I found out about this paper from another student. Also, the paper was published in 2015, so it must have been difficult since no one else did this until now. Setting the boolean relaxation problem aside, the homework questions were challenging but doable with some problem solving insights (one might need help for these, though), and they were brutally educational.
In terms of homework logistics, we had a paid grader who graded the homeworks, which is different from the previous iteration of the course (Fall 2014) when students had to self-grade their submissions. Note that Laurent’s EE 227BT website is (currently) incorrect; I think he recycles the same links for his classes, so some of it is out of date for the Fall 2015 edition. Our grader was surprisingly generous with points but did not offer detailed feedback and also took three or four weeks before providing grades. In part, this was because of the large class size. We had perhaps eighty students at the start before settling to fifty or sixty.
One of the “less-awesome” aspects of this class, in my opinion, was that we barely followed the projected outline. We were supposed to get five homework assignments, released every other Thursday, which meant we would get two weeks to do each assignment. However, because the lectures quickly fell behind the outline, Laurent delayed the second homework by a week, which caused a few more delays for subsequent assignments. This meant that homeworks eventually spilled over into time that was originally designated for final project work. I think it would be best to design homeworks conservatively so that even if the lectures get delayed, there’s no need to put off the homework due dates.
We had a midterm, but that was also delayed, by a week. It was in-class for 80 minutes, open note (but not open laptop or Internet). It had three questions, each with multiple parts, and was out of 40 points total. Judging from the distribution of scores, I think most students got somewhere between 15 and 30 points. It was definitely a challenging midterm, but in retrospect, I thought it was fair, and was of higher quality compared to the CS 280 midterm.
The third part of our grade was based on the final project. We started final project discussions really early, in September! Almost from the beginning, Laurent designed lectures so that we would cover standard concepts (e.g., Lagrange duality) for 75 minutes, and then the last 5 minutes would be an open discussion of final project ideas. Despite the early focus on final projects in the lectures, in reality we didn’t have that much time to work on them due to the homeworks and midterm getting delayed and cutting into project time. I think the course staff should address this in future iterations of the course.
I worked in a group of four for my final project, where we investigated various properties of neural networks. We read a lot of research papers (the “literature review” that Laurent kept mentioning in lecture) and ran experiments using Caffe and CVX. We wrote this up in a forty-page final project report. Going through and editing that at the end was a lot of work! A quick warning to future students: the project report due date was set before RRR week, which I think is unusual, since most graduate courses allow students to work on reports through mid-December.
In addition to a report, we had project presentations, which I was happy about since it’s fun to give talks. Not all students would agree with me. During the presentations, my sign language interpreters would comment on some of the students who appeared to be really nervous. To make matters worse, Laurent brought a hand-held microphone to the class, and about half of the students actually held the microphone when they were talking. No, I’m serious! And it’s not like we were on stage at Broadway — we were in a normal-sized classroom! I don’t like holding a microphone because it would make it completely obvious to the rest of the class that I was nervous about public speaking! I think Laurent had good intentions about bringing the microphone, but to future students, please don’t use microphones when talking.
When it was my turn to present, I put the microphone away after someone handed it to me (sorry, not using it!) and immediately started off with a planned joke. I told the class to pretend that Laurent and I were “trapped in a world that represents the loss function of the neural network.” (Don’t ask why!) I continued the story: I led Laurent to a local minimum, but he got angry and wanted the global minimum. I calmly responded that local minima are just as good as the global minimum in neural networks. I added a little acting and tried to cleverly alter my tone of voice. The class roared in laughter, and I think that was probably the most successful joke I have ever pulled off in a class presentation.
To wrap up my thoughts on EE 227B, I think it is similar to most classes I’ve taken in the sense that it is challenging, but very educational. I now feel like I have a much better understanding of concepts in linear algebra, especially those about norms, eigenvectors, and matrix decomposition. Many students who take this course do research in Artificial Intelligence fields, and EE 227B enables students to read AI research papers without getting bogged down by the notation and definitions. This was a huge problem for me when I first started to read machine learning papers a few years ago. I couldn’t even consistently remember what meant! Thanks to EE 227B, and some of my own independent linear algebra studying, I’ve cleared a lot of that initial “notation hurdle”.
Finally, to future students who are considering this class, the best advice I have is to make sure that your linear algebra skills are sharp. In particular, be sure you know about matrix norms, eigenvectors, and other forms of matrix decomposition (e.g., Singular Value Decomposition).
If you’re weak in those areas, then in the words of Laurent, “run away to EE 227A.”
I took Advanced Robotics (CS 287) last semester, which is the graduate level class that Pieter Abbeel teaches at Berkeley. You can view the course website here. Robotics is a vast, highly interdisciplinary field, so to restrict the focus, CS 287 is about the math and algorithms of robot systems. No, we didn’t see giant, science-fiction style robots battle each other, but we did observe a research robot tie knots (alas, through videos, not in real time).
Before the class even began, I could tell we would have some logistics issues. Like almost every course I have taken at Berkeley, CS 287 was substantially over-enrolled at the start; we had perhaps eighty students before settling down to about sixty at the end. According to the CS 287 websites from previous years, it looks like the Fall 2009 and Fall 2012 courses had nineteen and fifteen students, respectively. Yeah, welcome to the new normal.
Due to the class size, Pieter actually provided two different lecture times, one in the morning and one in the afternoon, and I suspect he also convinced John to do the same thing for CS 294-112. Pieter did this to get to know the students better. During some of the class breaks, he would ask a handful of students to introduce themselves to everyone. Since I sat in the front corner of the room for optimal use of sign language interpreting services, I was called on first. From these introductions, I learned a few things about the class composition:
There were a lot of mechanical engineering graduate students. So many, to the point where I was complaining (er, joking) about this with my interpreters midway through a long sequence of mechanical engineers introducing themselves. It’s a good thing that no one else in the class (I think…) can understand sign language. (PS: to mechanical engineers reading this, I was joking so please don’t get angry.)
A lot of the students do not speak clearly! Many are quiet, have heavy foreign accents, or exhibit both qualities. The most egregious case resulted in my interpreter not understanding a single word a student said, which I mentioned earlier here.
A lot of the students did robotics research of some form, whether it was in computer science, mechanical engineering, electrical engineering, or a related field. Then I’m confused, is it just this year that robotics suddenly became popular? Or is it because CS 287 wasn’t offered last year and that this is the “overflow” year?
In terms of course material, CS 287 combined lectures on standard topics in artificial intelligence (e.g., optimization and probability) and on more obscure, robotics research subjects. The course lectures could be divided as follows: Markov Decision Processes, optimization, probability, and research. Overall, I felt that the lectures were polished and of high quality. Pieter seemed like he really knew the material and was able to offer many doses of intuition for some of the more technical material.
I discuss this in my other reviews, so I’ll continue the trend: how did the lectures mesh with sign language interpreting services? Pieter lectured at a fast pace, which was problematic for my two interpreters, who were often exhausted when their 20-minute shifts were up. On the positive side, Pieter spoke loudly and clearly, to the point where I actually think he’s one of the easiest people for me to understand. Consequently, relative to other classes, I did not have much difficulty in terms of identifying the exact words he uttered. It’s also somewhat ironic that he would be the one to mention to me an ideal future where people have “virtual captions” projected out of their mouths, which would display the text they say in real time. Yes, I would like for that to happen.
As an added benefit, the course slides contained a lot of information. In many cases I could understand a concept or a homework sub-problem just by reading the appropriate slides, which is really handy for a text-heavy person like me. Incidentally, while Pieter wrote a lot of math on a whiteboard, in almost all cases it was math directly from the slides, and he was writing it out for intuition. Thus, taking hand-written notes is probably unnecessary for this class.
No course is without its hiccups, however, and I’d like to bring up a few points that may (or may not) matter to future students:
The difficulty of lectures varied considerably, which one can probably tell by browsing some of the slides. I thought the easiest class was the one on introductory probability. Since the material is quite rudimentary, I think that lecture needs to be eliminated in future iterations of the course. Basic probability is an ironclad prerequisite for understanding the math of robot systems, so students should already know it coming in. Other lectures were more complicated. The convex optimization and Kalman Filtering lectures would have been hard for me to follow had I not already had substantial exposure to those concepts.
Towards the end of the semester, we had a “project speed-dating” lecture, which is when we gathered in small groups and shared our progress on the final project. Ideally, students could get feedback and learn what others were doing. In reality, most students skipped this class, and I’m not sure how beneficial it was to those students who did attend (I didn’t benefit). Furthermore, we eventually had final project presentations. Thus, I think project speed-dating should be replaced with a “standard” robotics lecture.
We had three class sessions where guests from industry lectured about their companies. I’m neutral towards these, and would suggest that these only happen when Pieter (or another future instructor, if applicable) is traveling and unable to lecture.
CS 287 had four problem sets which involved math and MATLAB programming. I thought they were, on average, less challenging compared to problem sets in other classes. The math did not require incredible problem-solving skills, and I think they were designed to accommodate people from other fields (mechanical engineers …). For instance, the fourth homework asked to prove that covariance matrices are positive semidefinite, which is something that a lot of machine learning students can answer in thirty seconds. For the coding, we had to fill in MATLAB code in the designated “YOUR CODE HERE” sections. We got a lot of starter code for these assignments, so it’s relatively easy to understand how the code works in the overall pipeline.
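For reference, the standard argument behind the covariance claim is short. The notation here is an assumption for illustration, not taken from the homework: x is a random vector with mean μ and covariance Σ = E[(x − μ)(x − μ)^T], and a is any fixed vector of matching dimension.

```latex
a^\top \Sigma\, a
  = a^\top \, \mathbb{E}\!\left[ (x - \mu)(x - \mu)^\top \right] a
  = \mathbb{E}\!\left[ \left( a^\top (x - \mu) \right)^2 \right]
  \;\ge\; 0 .
```

The middle step uses linearity of expectation, and the final quantity is the variance of the scalar a^T x, which cannot be negative. Since this holds for every a, Σ is positive semidefinite.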
To turn in homeworks, we used Gradescope, a company Pieter co-founded with Berkeley students. We only had to turn in PDFs of our answers, and the course staff could grade code-based assignments by spot-checking our plots. (Part of the reason why we had lots of starter code is that some of it is used to generate plots, which means that they are standardized across all student submissions.) We had page limits for our solutions, so be sure you know how to cram lots of figures together in LaTeX, such as by using minipages or subfigures. Oh, I should mention: there are no solution sets to these assignments. I agree with Pieter in that there would be too much temptation for students to search for old solutions. Well, I wouldn’t search, but I’m not sure about others.
In addition to regular homeworks, we had four (!) optional extra credits, plus the final project. I only did one of the extra credit assignments, so I don’t have much to comment on those.
For my final project, I worked on a deep learning project about Atari game play, but my project ended up relating more to human learning since I analyzed data from humans playing Atari games on Amazon Mechanical Turk, and I ran out of time to integrate my findings with a Q-Learning agent. Pieter was the one who suggested this project. In fact, back in October, he and the two GSIs actually met with every project group in the class for five minutes to discuss the final project. Then, a day later, I assume Pieter sent out personalized emails to every group with project suggestions. That must have been a lot of work!
Just like in CS 280, we had project presentations, not a project poster session. That is a good thing. Single-student groups presented for 5.5 minutes. I tried to be funny by sprinkling four jokes into my talk, and went so far as to put a picture of Bernie Sanders in one of my slides. Unfortunately, I think my Sanders-related joke backfired since a lot of the students were international students or were not fluent in American politics, whereas I have very strong political beliefs.
We then had to write the usual report to wrap up the project. I will warn future students: the grading for the final project is somewhat stricter than the grading for homeworks, though admittedly I think it was hard to get a really low grade on the project. Thus, to get an A, try to get at least 90 percent of the homework points, and make up for lost points with the four extra credit assignments. Pieter really makes it clear how our grades are computed, which makes the process less stressful for students who care about grades. This is in contrast with some other professors, who might not even return grades for final projects.
In conclusion, I enjoyed CS 287 and would highly recommend it to future students. Again, if possible, consider taking this class concurrently with Deep Reinforcement Learning or a similar two-credit class, as they would reinforce each other.