At the initial phase of experimentation and review of related work, I focused on analyzing the relationship between the reasons to loan as stated by the lenders and the objective of each separate team. To investigate this, I concentrated my research only on lenders and teams that have stated their reason to loan. Taking that constraint into consideration, the team data gathered from the Kiva API corresponds to teams created within the same time span as the lenders in the dataset.
There are 11, teams within that space, making up a total of On average, a team has 32 members, with a standard deviation of and a median of 4 members. Figure 2 shows the distribution of teams by member count on a logarithmic scale. There are two types of teams: those that are open for anyone to join, and those that require prospective members to be approved by the administrators.
Each type is thus identified as either open or closed. Which type of team membership contributes more to Kiva?
Closed teams on average fund loans, while open teams invest in , with a deviation of 4, and 13, loans respectively. This suggests that open teams contribute to more loans more often, which leads to increased activity. Liu, Y. The next question concerns the amount lent.
Which type of membership gives more per loan? Since the data does not reveal how much each lender gives with each loan, most Kiva-related papers make the assumption that each lender provided an equal amount to each loan. That is the same amount lent as for teams of the closed membership type.
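The equal-split assumption can be made concrete with a small sketch. The function names and the shape of the input (pairs of loan amount and lender ids) are hypothetical and chosen only for illustration; they are not taken from the Kiva API.

```python
def equal_share(loan_amount: float, lender_count: int) -> float:
    """Amount attributed to each lender under the equal-split assumption."""
    if lender_count <= 0:
        raise ValueError("a funded loan needs at least one lender")
    return loan_amount / lender_count


def team_amount_per_loan(loans, team_members):
    """Average amount a team is assumed to contribute per loan it took part in.

    `loans` is an iterable of (loan_amount, lender_ids) pairs and
    `team_members` is a set of lender ids; both shapes are hypothetical.
    """
    contributions = []
    for loan_amount, lender_ids in loans:
        members_in_loan = [lender for lender in lender_ids if lender in team_members]
        if members_in_loan:
            share = equal_share(loan_amount, len(lender_ids))
            contributions.append(share * len(members_in_loan))
    return sum(contributions) / len(contributions) if contributions else 0.0


# Made-up data: three loans and a two-member team.
sample_loans = [(100.0, ["a", "b", "c", "d"]), (50.0, ["a", "e"]), (30.0, ["f"])]
team_average = team_amount_per_loan(sample_loans, {"a", "b"})
```

Because every lender on a loan is credited with the same share, any difference between membership types can only come from how many loans and how many members are involved, not from individual generosity.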
In terms of the amount lent, there is no visible difference between membership types. This agrees with the findings of other related work. There is, however, a significant difference in the activity each type contributes to the ecosystem. Figure 3 shows the distribution of loans funded by each membership type. The distributions are similar, but the right tail of the open membership type is longer, meaning more loans get funded by this type of team.
Kiva should promote the creation of open teams. Some additional experiments performed on the Kiva data are shown in the following section. They include clustering, dimensionality reduction and filtering. To support the main goal of producing more activity in the Kiva ecosystem by forming teams, we need to find a source of reasons from which teams may be created, thereby identifying similar lenders. One direct way of doing that is to investigate the loans and try to determine clusters of loans from which we may derive a reason to lend.
Imagine that we identify clusters in which a certain sector is relevant, or perhaps a combination of attributes such as country and sector.
We could theoretically create clusters, and identify which reason to loan they would satisfy. From here, we would have to identify which teams or lenders match these clusters in order to make a recommendation. To cluster the loans, I implemented a k-means algorithm enhanced with principal component analysis.
Clusters are created using the loan purpose given by the borrower. This is an open text attribute in which the borrower defines the use of the money. During preprocessing, I created vectors of words representing each loan. Since we know that each loan is categorized into sectors, I set the number of clusters to 12, hoping to see a relation. However, the results show well-formed clusters, but no relation to the sector.
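A minimal sketch of this pipeline is given below, assuming scikit-learn. It is not the original implementation: the text does not say whether principal component analysis was applied before clustering or only for plotting, so projecting first is an assumption, as are the function name, the plain word-count vectors and the tiny example corpus.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer


def cluster_loan_purposes(purposes, n_clusters=12, n_components=2, seed=0):
    """Cluster free-text loan purposes; returns (labels, reduced coordinates)."""
    # Bag-of-words vectors, one row per loan.
    word_counts = CountVectorizer().fit_transform(purposes).toarray()
    # Project onto the leading principal components before clustering
    # (assumed order of operations).
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(word_counts)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(reduced)
    return labels, reduced


# Tiny hypothetical corpus, so only two clusters are requested here.
sample_purposes = [
    "buy rice and milk to sell",
    "buy rice, lard and milk for the store",
    "purchase shirts, pants and shoes",
    "purchase blouses, shirts and shoes to sell",
]
labels, coords = cluster_loan_purposes(sample_purposes, n_clusters=2)
```

With the full loan data, `n_clusters=12` would match the number of sectors, and the two columns of `coords` could be plotted once colored by `labels` and once colored by sector to compare the two groupings.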
As we can see in Figure 4 below, the clusters created by the algorithm do not coincide with the groups defined by the actual sector. The first column of the figure shows the clusters of loans created by the algorithm. The second column shows the same loans colored by sector. In the top left graph we can see that the two main principal components separate the loans well.
This separation is not aligned with the sector of each loan. With the same idea of identifying clusters, I reproduced the experiment over teams. The reason to loan stated by each team is also an open text field. Unfortunately, the clusters created for teams are not as good.
Teams may be seen as filters: if there is already one team for fishing, having two or more is unlikely. Arranging teams together in clusters is therefore a harder task. Figure 5 shows the team clusters. The following figures show some of the centroids of the clusters that showed a clear objective.
As mentioned before, the languages in the dataset vary, with English being the most common language, followed by Spanish. Out of the twelve clusters, seven are in English and five are in Spanish. The complete set of images can be seen in the appendix. Figure 6 shows cluster number 9, whose most relevant words are clothing-related items.
It contains the type of clothing (for women, men and children) and the type of article, such as pants, shirts, shoes, blouses, and so on. In addition, the cluster has some food-related words such as lard, rice, milk and other ingredients. This cluster would suggest creating teams and finding lenders that are interested in providing clothing and increasing food inventory for business owners.
Another interesting cluster is number 12, which refers to housing. It also includes some business related terms for real estate.
From this cluster we can identify users that may want to fund infrastructure, housing and real estate projects. One of the Spanish clusters is shown in Figure 8. This cluster is all about food and, since Kiva finances projects, we can infer that the loans in it aim at building inventories, creating a convenience store or even setting up a restaurant.
This approach could yield good results if we are able to identify a reason to fund within teams or lenders that corresponds to these clusters. For future work, it would be helpful to optimize this approach in order to create more possible purposes and identify teams and lenders that have funded similar loans in the past.
Both lenders and teams provide open statements about their own motivation to lend, and the related work mentioned previously suggests that teams increase activity in the ecosystem. I therefore decided to review similarities between the two, in order to enhance or create recommendations. If it is easy for a lender to explore teams, and it is reasonable to think that they would join the one that is most similar to them, we should expect to see a natural match between the lenders in a team.
All participants are encouraged to post everything in English, so that exposure to lenders is greater and more effective. However, not every team and every lender states their purpose in a single language. The data set has most of the information in English, but it also has Spanish, French, and Portuguese, amongst other languages. Translating each text to English may be problematic when expressions are used that cannot be translated directly.
For that reason I decided to run the analysis using the original languages. To measure the similarities I implemented tf. The approach is the same one that search engines use. First, a dictionary is built from the collection of documents to be indexed. When a new document is added, the dictionary is updated.
When a query is executed, the dictionary is used to determine which document is most similar to it. I do not need to update the dictionary, since there are no new teams and I am matching lenders to teams based on their reason to loan. The query, in turn, is the reason each lender gives for lending. To measure the similarities, I have used cosine similarity weighted by tf. Since the overall goal of this paper is to provide a team recommendation, a subset is created by selecting lenders that meet the following constraints: (a) they have provided a reason to loan and (b) they are members of more than one team.
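The matching step can be sketched in plain Python as follows. The weighting name is truncated in the text ("tf"); the sketch assumes it refers to the usual tf-idf scheme and uses one common smoothed idf variant, which may differ from the one actually used. All function names and the example texts are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def build_index(team_texts):
    """Build the dictionary (idf weights) and tf-idf vectors for the teams."""
    docs = [Counter(tokenize(text)) for text in team_texts]
    doc_freq = Counter(term for doc in docs for term in doc)
    n_docs = len(docs)
    # Smoothed idf; one common variant, not necessarily the author's.
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}
    vectors = [{term: tf * idf[term] for term, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_teams(lender_reason, idf, team_vectors):
    """Return (team_index, similarity) pairs, most similar first."""
    counts = Counter(tokenize(lender_reason))
    # Terms missing from the dictionary are ignored: the index is not updated.
    query = {term: tf * idf[term] for term, tf in counts.items() if term in idf}
    scores = [(i, cosine(query, vec)) for i, vec in enumerate(team_vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


team_texts = [
    "we lend to farmers and fishing families",
    "supporting women who start a business",
    "education loans for students",
]
idf, team_vectors = build_index(team_texts)
ranking = rank_teams("I want to help women start a business", idf, team_vectors)
```

Here the teams play the role of the indexed documents and each lender's reason is the query, so the dictionary is built once and never updated.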
When applying these criteria, a total of 7, lenders are selected. The similarity of each lender was measured against each of the 11, teams, recording a vector with the most similar teams for each user.
If we predict only one team for a given lender, only 1. Another method to evaluate this approach is to produce as many recommendations as possible from the list of the most similar teams, by setting a similarity threshold. When doing this, overall accuracy can be improved significantly to 6.
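The two evaluation modes can be sketched as below. Counting a lender as correct when at least one recommended team is among the teams they actually joined is an assumption, since the text does not define accuracy precisely; the function names and example data are hypothetical.

```python
def recommend(ranked_teams, threshold=None):
    """Pick the top team, or every team at or above a similarity threshold."""
    if threshold is None:
        return [ranked_teams[0][0]] if ranked_teams else []
    return [team for team, score in ranked_teams if score >= threshold]


def accuracy(rankings, memberships, threshold=None):
    """Share of lenders with at least one recommended team among their own teams.

    The hit-based definition is an assumption made for this sketch.
    """
    hits = 0
    for lender, ranked in rankings.items():
        if set(recommend(ranked, threshold)) & memberships[lender]:
            hits += 1
    return hits / len(rankings) if rankings else 0.0


# Made-up similarity rankings (most similar first) and actual memberships.
rankings = {
    "lender_1": [("team_a", 0.9), ("team_b", 0.5)],
    "lender_2": [("team_c", 0.8), ("team_d", 0.4)],
}
memberships = {"lender_1": {"team_a"}, "lender_2": {"team_d"}}

top_one = accuracy(rankings, memberships)
loose = accuracy(rankings, memberships, threshold=0.3)
```

Lowering the threshold produces more recommendations per lender, which can only keep or raise this kind of accuracy, at the cost of less specific suggestions.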
Figure 9 shows the accuracy at different thresholds. This means that lenders do not select their teams based primarily on their reason to loan. There is no easy way for a lender to find their most similar team based on their reason to loan via the Kiva website. The only recommendations that are given are based on what other people search.
However, the team data analyzed in this paper is 6 years older and ten times the size of the data reviewed by the authors.
Either the selection of a team is coherent among similar users but has no significant relation to the team goal, or lender behavior shifts over time. Choo, J.
Their evidence is the constant change on the leaderboard, where Kiva shows the top 10 teams based on the amount they lend and their new members. The proposed model is explained in the following section, which includes a description of data preprocessing, model construction and evaluation. The similarity analysis in the previous section demonstrated the need to gather additional information in order to build a better model.
To attain this, I followed a content-based approach. For each lender, a vector describing their lending activity was created and used to predict which team a given lender would join. The data extracted for the final model includes the following: number of loans funded, number of invitations sent to fund a loan, geo-region of the user, occupation, registration date at Kiva, greatest, smallest and average loan amount, most frequent activity, sector, geo-region and field partner funded, and finally two vectors with the codes and frequencies of activity, sector, continent and partner of funded loans.
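The frequency vectors at the end of this list can be sketched as follows: each attribute has a fixed, ordered list of codes, and every lender gets one count per code, including zeros. The function name and the sector subset are hypothetical.

```python
from collections import Counter


def frequency_vector(funded_codes, all_codes):
    """Count how often a lender funded each code, in a fixed code order."""
    counts = Counter(funded_codes)
    return [counts.get(code, 0) for code in all_codes]


# Hypothetical subset of sector codes; the real list would cover every sector.
SECTORS = ["Agriculture", "Clothing", "Food", "Housing"]

lender_a = frequency_vector(["Food", "Food", "Clothing"], SECTORS)
lender_b = frequency_vector(["Housing"], SECTORS)
```

Because the code order is shared, position `i` means the same thing for every lender, which is what allows the vectors to be used directly as model features.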
These final vectors have an equal size for each lender, since every code of each attribute is included, recording how often the lender has contributed to it. Registered lenders at Kiva have no interpersonal relations, and teams may be formed regardless of any preset conditions. For that reason I would expect the selection of a team to be independent among lenders. If this assumption is right, I may use Naive Bayes with a Gaussian distribution as the baseline model.
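A baseline of this kind could look like the sketch below, assuming scikit-learn. The data is synthetic and stands in for the lender attributes, which are not reproduced here; the two-feature, two-team setup is purely illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
# Synthetic stand-in for lender features: two numeric features, two teams.
features = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(4.0, 1.0, (50, 2))])
teams = np.array([0] * 50 + [1] * 50)

X_train, X_test, y_train, y_test = train_test_split(
    features, teams, test_size=0.3, random_state=0, stratify=teams
)

# Each feature is modelled as a Gaussian per team.
baseline = GaussianNB().fit(X_train, y_train)
baseline_accuracy = baseline.score(X_test, y_test)
```

Note that Gaussian Naive Bayes itself assumes that the features are conditionally independent given the team, so categorical attributes such as occupation would first need a numeric encoding.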
The attributes in the baseline are those that are related directly to each lender. The model grew because of potential improvements identified at each iteration and revision. The tuning process for the final decision tree model was done in two steps.
In the first step I trained several trees while changing one parameter at a time. The parameters tested are the maximum number of features to use at each split, the minimum samples at each split (resulting instances for a leaf), and the maximum number of leaf nodes. The default value of each parameter was 4, and 30 respectively.
The accuracy of each tree was recorded. From the first step, the parameter values that produced the simplest and most accurate tree were extracted, resulting in a maximum number of features of 4, maximum leaf nodes of 50 and the minimum samples at each split of were selected.
In the second step I followed a similar approach. I again trained several trees, but this time the default value of each parameter was the one identified in the first step. The best model is reached with 2 features at each split, with maximum leaf nodes of 20, and samples minimum of The final model tree is presented in Figure 12, producing 20 rules to produce a recommendation given a lender or a team profile.
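The two-step, one-parameter-at-a-time procedure can be sketched with scikit-learn as follows. The data is synthetic, and the default values and candidate grids are illustrative only; they are not the values used in the study, some of which are missing from the text. Scoring by cross-validated accuracy is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def tune_one_at_a_time(X, y, defaults, grid, cv=3, seed=0):
    """Vary one parameter at a time around `defaults`; return the best value of each."""
    best = {}
    for name, candidates in grid.items():
        scores = {}
        for value in candidates:
            params = {**defaults, name: value}
            tree = DecisionTreeClassifier(random_state=seed, **params)
            scores[value] = cross_val_score(tree, X, y, cv=cv).mean()
        # On ties, max() keeps the candidate listed first in the grid.
        best[name] = max(scores, key=scores.get)
    return best


# Synthetic data and illustrative values, not those of the study.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)
defaults = {"max_features": 3, "min_samples_split": 2, "max_leaf_nodes": 25}
grid = {
    "max_features": [2, 3, 6],
    "min_samples_split": [2, 10, 50],
    "max_leaf_nodes": [10, 25, 50],
}

step_one = tune_one_at_a_time(X, y, defaults, grid)
# The second step repeats the search with the first step's winners as defaults.
step_two = tune_one_at_a_time(X, y, step_one, grid)
```

Unlike an exhaustive grid search, this procedure trains only one tree per candidate value, at the cost of possibly missing interactions between parameters; the second pass partly compensates for that.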
I have shown that people who join Kiva as lenders and have since joined a team, do not select this team based on its purpose alone.