Question: Finding good collaborators: Create a view (virtual table) called good_collaboration that lists pairs of stars who appeared in movies. Each row in the view describes one pair of stars who have appeared in at least 4 movies together, where each of those movies has a score >= 75. The view should have the format: good_collaboration (cast_member_id1, cast_member_id2, num_movies, avg_movie_score). Exclude self pairs (cast_member_id1 == cast_member_id2). Keep symmetrical (mirror) pairs; for example, keep both (A, B) and (B, A).
Hint: Self-joins will likely be necessary. After creating the view, list (cast_member_id1, cast_member_id2, num_movies, avg_movie_score) from the view, sorted by average movie score.
movie-cast.txt
9,162652153,"Hayden Christensen"
9,162652152,"Ewan McGregor"
9,418638213,"Kenny Baker"
9,548155708,"Graeme Blundell"
9,358317901,"Jeremy Bulloch"
9,178810494,"Anthony Daniels"
9,770726713,"Oliver Ford Davies"
9,162652156,"Samuel L. Jackson"
9,162655731,"James Earl Jones"
9,284442167,"Claudia Karvan"
9,162652385,"Christopher Lee"
9,425838884,"Peter Mayhew"
9,162652155,"Ian McDiarmid"
9,196103011,"Temuera Morrison"
9,770711854,"Trisha Noble"
9,444129912,"Wayne Pygram"
9,162691723,"Jimmy Smits"
9,364660718,"Bruce Spence"
9,162656296,"Frank Oz"
9,162714169,"Ling Bai"
9,770961398,"Warren Owens"
.
.
.
770876554,770916051,"Alec Wilson"
770876554,770773491,"Edmund Pegge"
770876554,770925843,"Noel Travarthen"
770972512,335716545,"Noam Chomsky"
movie-name_score.txt
9,"Star Wars: Episode III - Revenge of the Sith 3D",80
24214,"The Chronicles of Narnia: The Lion, The Witch and The Wardrobe",76
1789,"War of the Worlds",74
10009,"Star Wars: Episode II - Attack of the Clones 3D",67
771238285,"Warm Bodies",-1
770785616,"World War Z",-1
771303871,"War Witch",89
771323601,"War of the Worlds the True Story",-1
771243843,"Safe Haven: The Underground Railroad During The Vietnam War",-1
770784043,"Bride Wars",11
11292,"Star Wars: Episode IV - A New Hope",94
11366,"Star Wars: Episode VI - Return of the Jedi",79
.
.
.
770894512,"Nazis, The - Nazi War Crimes",-1
770916696,"WCW Fall Brawl 1995: War Games",-1
770949969,"Colors of War - Europe",-1
770972512,"Plan Colombia: Cashing In On the Drug War Failure",-1
prog3_createTable_sql.txt
#Create Table movies
create table movies
(
movie_id integer,
name varchar(1000),
score integer
);
#Load Data (titles are double-quoted and may themselves contain commas)
load data local infile '~/prog3/movie-name_score.txt' into table movies fields terminated by ',' optionally enclosed by '"';
#Create Table Cast
create table cast
(
movie_id integer,
cast_id integer,
cast_name varchar(1000)
);
#Load Data (cast names are double-quoted in the file)
load data local infile '~/prog3/movie-cast.txt' into table cast fields terminated by ',' optionally enclosed by '"';
select count(*) from movies;
select count(*) from cast;
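Row counts alone will not show whether quoted fields were split on the commas inside them. As a rough spot check, assuming the sample rows shown above are present in the loaded files, one could look at a title that contains commas and at one cast name:

```sql
# Spot check (sketch): movie_id 24214 is the sample row whose title contains commas.
# If the title was read as a single field, score should come back as 76.
select movie_id, name, score from movies where movie_id = 24214;

# Sample cast row: the name should not carry its surrounding double quotes.
select movie_id, cast_id, cast_name from cast where cast_id = 162652153;
```

If the score is wrong or the names still include quote characters, the field options on the load statements are the first thing to revisit.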
Below is a segment of my SQL code (from prog3_sql.txt), which is still a work in progress. It is mostly set up, but whenever I run it I get the error "Duplicate column name 'cast_id'". I've looked everywhere but can't find the cause.
prog3_sql.txt
CREATE VIEW good_collaborations AS
SELECT c1.cast_id, c2.cast_id, COUNT(c1.movie_id) AS tot_movie, AVG(m.score) AS tot_score
FROM cast AS c1
INNER JOIN cast AS c2 ON c1.cast_id = c2.cast_id
INNER JOIN movies AS m ON m.movie_id = c1.movie_id
WHERE c1.cast_id < c2.cast_id
GROUP BY c1.cast_id, c2.cast_id
HAVING tot_movie >= 4 AND tot_score >= 75;
SELECT * FROM good_collaboration ORDER BY tot_score;
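The "Duplicate column name 'cast_id'" error comes from the SELECT list: every column of a view needs a unique name, and c1.cast_id and c2.cast_id are both exposed as cast_id. Giving each one an alias removes the error. The sketch below also makes several other changes that the task description seems to call for; these are assumptions about the intended result, not a confirmed solution. The self-join matches on movie_id (so both stars share a movie) rather than on cast_id, <> replaces < so that mirror pairs are kept, the score condition is applied per movie in WHERE, and the view name matches the one queried afterwards (the code above creates good_collaborations but selects from good_collaboration).

```sql
# Sketch only: assumes the movies and cast tables from prog3_createTable_sql.txt.
CREATE VIEW good_collaboration AS
SELECT c1.cast_id   AS cast_member_id1,   # aliases give each view column a unique name
       c2.cast_id   AS cast_member_id2,
       COUNT(*)     AS num_movies,
       AVG(m.score) AS avg_movie_score
FROM cast AS c1
INNER JOIN cast AS c2
        ON c1.movie_id = c2.movie_id      # same movie
       AND c1.cast_id <> c2.cast_id       # no self pairs; (A, B) and (B, A) both kept
INNER JOIN movies AS m
        ON m.movie_id = c1.movie_id
WHERE m.score >= 75                       # assumption: only movies scoring >= 75 are counted
GROUP BY c1.cast_id, c2.cast_id
HAVING COUNT(*) >= 4;

# List the pairs sorted by average movie score.
SELECT cast_member_id1, cast_member_id2, num_movies, avg_movie_score
FROM good_collaboration
ORDER BY avg_movie_score;
```

If "each movie has score >= 75" is instead read as "every movie the pair shares must score at least 75", one alternative is to drop the WHERE clause and use HAVING COUNT(*) >= 4 AND MIN(m.score) >= 75. Note also that the original join condition c1.cast_id = c2.cast_id combined with c1.cast_id < c2.cast_id can never be true for any row, so that query would return nothing even once the column names are fixed.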
